Validation

(Production)

Timezone

Verify that ``git``’s Unix timestamps match the current timezone.

$ git commit -a
$ date
dimanche 11 novembre 2018, 16:13:15 (UTC+0100)

$ git log --after "2018-11-11 16:00:00" --before "2018-11-11"
...
Date:   Sun Nov 11 16:11:13 2018 +0100

$ ./keepcool.py -r . --after "2018-11-11 16:00:00" --before "2018-11-11" -v
...
Date(1541949073): Sunday 2018-11-11 16:11:13 -> ['week-end']
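
The check above can also be reproduced outside the tool. The following is a minimal sketch, not keepcool.py's actual code: it converts the Unix timestamp from the verbose output to a fixed +0100 offset and tests whether the day falls on a week-end. The helper name `describe_commit_time` and the fixed-offset approach are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def describe_commit_time(timestamp, utc_offset_hours):
    """Convert a Unix timestamp to a fixed UTC offset (hypothetical helper).

    Returns the formatted date and whether it falls on a week-end.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(timestamp, tz=tz)
    # isoweekday(): Monday is 1, Sunday is 7.
    is_weekend = moment.isoweekday() >= 6
    return moment.strftime("%Y-%m-%d %H:%M:%S %z"), is_weekend


# Timestamp taken from the verbose output above, checked against +0100.
formatted, is_weekend = describe_commit_time(1541949073, 1)
print(formatted, is_weekend)
```

The printed date should agree with the `Date:` line reported by git log for the same commit.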

Branches

Check on a repository having several branches.

$ mkdir linux
$ cd linux
$ git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
$ git branch
* master

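Only ``master`` is listed. A fresh clone creates a local branch only for the remote's default branch; other remote branches, if any, show up with ``git branch -a``. Also note that ``git log`` by default lists only the commits reachable from the current ``HEAD``, not those of all branches; covering every branch requires ``--all``.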
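
To see whether branch handling matters for a given repository, one option is to compare the number of commits reachable from HEAD with the number reachable from all refs. The sketch below relies only on `git rev-list --count`, with and without `--all`; the helper names `rev_list_command` and `count_commits` are hypothetical and are not part of keepcool.py.

```python
import subprocess


def rev_list_command(all_refs=False):
    """Build the git command that counts commits (hypothetical helper)."""
    command = ["git", "rev-list", "--count"]
    # --all walks every ref (branches, tags, remotes); HEAD walks only the
    # history of the currently checked-out branch.
    command.append("--all" if all_refs else "HEAD")
    return command


def count_commits(repository, all_refs=False):
    """Run the command inside `repository` and return the count as an int."""
    output = subprocess.check_output(rev_list_command(all_refs), cwd=repository)
    return int(output.decode("ascii").strip())
```

If `count_commits("linux", all_refs=True)` returns a larger number than `count_commits("linux")`, some commits are reachable only from refs other than the current branch.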

Performance

Use a profiler on a huge repository (about 800,000 commits).

$ ./keepcool.py linux/ -s 'sum'
...
 Mauro Carvalho Chehab: 29462 /   48%
    Greg Kroah-Hartman: 74367 /   70%
        Linus Torvalds: 75151 /   68%
       David S. Miller: 77947 /   74%
-------------------------------------
                  all: 796965 /   54%
-------------------------------------

$ python -m cProfile -o linux.profile keepcool.py linux
$ sudo pip install cprofilev
$ cprofilev -f linux.profile
[cProfileV]: cProfile output available at http://127.0.0.1:4000

13494147 function calls (13492172 primitive calls) in 19.650 seconds
Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
746797    8.571    0.000    8.571    0.000 {built-in method poll}
796965    2.025    0.000    5.867    0.000 keepcool.py:252(add_commit)
796965    1.392    0.000    1.392    0.000 {built-in method fromtimestamp}
796965    1.053    0.000    2.446    0.000 keepcool.py:27(__init__)
     1    0.907    0.907   17.516   17.516 keepcool.py:190(parse_logs)
     1    0.693    0.693    9.933    9.933 /usr/lib/python2.7/subprocess.py:1122(_communicate_with_poll)
746798    0.621    0.000    0.621    0.000 {posix.read}
796965    0.521    0.000    1.565    0.000 keepcool.py:74(belongs_to_workin_hours)
796965    0.499    0.000    2.945    0.000 keepcool.py:92(__init__)
...
796965    0.290    0.000    0.290    0.000 keepcool.py:230(get_user)

The profile shows that the largest internal time among the script's own functions (add_commit, 2.025 s) is only about seven times that of a single dictionary access (get_user, 0.290 s). To run faster, it would therefore be better to optimise the model than the code.
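
The same kind of table, ordered by internal time, can be obtained with the standard library alone. The sketch below is an illustration under assumptions: `busy_work` is a stand-in workload and `top_functions` is a hypothetical helper; neither comes from keepcool.py.

```python
import cProfile
import io
import pstats


def top_functions(profiler, limit=10):
    """Return a text report of the `limit` costliest functions by tottime."""
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the time spent inside a function, excluding its callees.
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


def busy_work(n):
    """Stand-in workload, used only to have something to profile."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
busy_work(10000)
profiler.disable()

report = top_functions(profiler, limit=5)
print(report)
```

Passing the path of a saved profile, such as linux.profile, to `pstats.Stats` in place of the profiler object should produce the same kind of report.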

Memory footprint

Check the memory usage while running.

$ while /bin/true; do sleep 2; ps -e -ovsz -orss,args= | grep [k]eepcool; done
 25184 20196 python ./keepcool.py linux
 40384 35372 python ./keepcool.py linux
 55532 50536 python ./keepcool.py linux
 70876 65832 python ./keepcool.py linux
 87572 82612 python ./keepcool.py linux
533500 528136 python ./keepcool.py linux
653648 648432 python ./keepcool.py linux
771196 766052 python ./keepcool.py linux
376008 371328 python ./keepcool.py linux

$ while /bin/true; do sleep 2; ps -e -ovsz -orss,args= | grep [g]it; done
233292 180476 git log --encoding=UTF-8 --pretty=%H; %cN; %ct
349572 269532 git log --encoding=UTF-8 --pretty=%H; %cN; %ct
434652 356148 git log --encoding=UTF-8 --pretty=%H; %cN; %ct
500564 410912 git log --encoding=UTF-8 --pretty=%H; %cN; %ct
524036 462844 git log --encoding=UTF-8 --pretty=%H; %cN; %ct

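The script does not use an excessive amount of memory compared to the git log query it launches: in these samples it peaks at roughly 750 MiB of virtual memory, against roughly 510 MiB for git log.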
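
The samples above can be summarised programmatically. The following sketch assumes the `ps` columns are VSZ and RSS in KiB followed by the command line, as in the listings; `parse_ps_line` and `peak_rss_mib` are hypothetical helpers written for this illustration.

```python
def parse_ps_line(line):
    """Split one `ps -ovsz -orss,args=` line into (vsz_kib, rss_kib, args)."""
    vsz, rss, args = line.split(None, 2)
    return int(vsz), int(rss), args


def peak_rss_mib(lines):
    """Return the highest resident set size among the samples, in MiB."""
    return max(parse_ps_line(line)[1] for line in lines) / 1024.0


# A few samples copied from the listings above.
script_samples = [
    " 25184 20196 python ./keepcool.py linux",
    "771196 766052 python ./keepcool.py linux",
    "376008 371328 python ./keepcool.py linux",
]
git_samples = [
    "233292 180476 git log --encoding=UTF-8 --pretty=%H; %cN; %ct",
    "524036 462844 git log --encoding=UTF-8 --pretty=%H; %cN; %ct",
]

print("script peak RSS: %.0f MiB" % peak_rss_mib(script_samples))
print("git peak RSS: %.0f MiB" % peak_rss_mib(git_samples))
```

Because the samples are taken every two seconds, the true peaks may be somewhat higher than the values captured here.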