Get the paths of the files changed in each commit and pull/merge request in GitLab

Using a .gitlab-ci.yml pipeline to get the paths of modified files


While running a pipeline in GitLab, you can use predefined environment variables to get the paths of the files changed by a commit on a single branch, or by a pull/merge request from one branch to another.

With git diff commands, there are two cases to handle:

  • Get the files changed by a commit on a feature branch. On a single feature branch, the current commit (os.environ['CI_COMMIT_SHA']) is preceded by the commit that was previously the latest on that branch, whose SHA is available as os.environ['CI_COMMIT_BEFORE_SHA']. In this case, we use git diff --name-only $CI_COMMIT_SHA $CI_COMMIT_BEFORE_SHA

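  • When a pull/merge request is created from one branch to another (for example, feature to master/main), GitLab sets CI_COMMIT_BEFORE_SHA to "0000000000000000000000000000000000000000", so there is no previous commit to compare against. The same value appears for the first commit pushed to a new branch. In this case, we use git diff-tree --no-commit-id --name-only -r $CI_COMMIT_SHA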

The following Python snippet uses the commands above. Let's name it find_file_diff.py

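import os
import subprocess

# Predefined GitLab CI/CD variables for the commit being built.
commit_sha = os.environ['CI_COMMIT_SHA']
before_sha = os.environ['CI_COMMIT_BEFORE_SHA']

print(commit_sha)
# CI_COMMIT_BRANCH is not set in merge request pipelines, so read it with .get()
# instead of indexing os.environ directly (which would raise KeyError).
print(f"CI_COMMIT_BRANCH is {os.environ.get('CI_COMMIT_BRANCH')}")

if before_sha == "0000000000000000000000000000000000000000":
    # No previous commit to compare against: list the files touched by this commit.
    commandGit = 'git diff-tree --no-commit-id --name-only -r ' + commit_sha
else:
    # Compare the current commit with the previous one on the branch.
    commandGit = 'git diff --name-only ' + commit_sha + ' ' + before_sha

print(commandGit)
p = subprocess.Popen(commandGit, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
# communicate() reads both streams and waits for the process to exit.
sout, serr = p.communicate()
print('******************************************')
print(sout)
print(serr)
# git ends its output with a newline, so the last element of this list is ''.
FinalModifiedFiles = sout.split('\n')
print(FinalModifiedFiles)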
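
As a variation, the branch that picks the git command can be pulled out into a small function, which makes it easy to check on your own machine without a GitLab runner. This is only a sketch: the names ZERO_SHA and build_diff_command are made up for illustration, and the function returns the command as an argument list rather than a string.

```python
# Hypothetical constant: the all-zero value GitLab uses when there is
# no previous commit to compare against.
ZERO_SHA = "0" * 40


def build_diff_command(commit_sha, before_sha):
    """Return the git command (as an argument list) that lists changed files."""
    if before_sha == ZERO_SHA:
        # No usable previous commit: list the files touched by the commit itself.
        return ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", commit_sha]
    # Otherwise compare the two commits, in the same order the script uses.
    return ["git", "diff", "--name-only", commit_sha, before_sha]


print(build_diff_command("abc123", ZERO_SHA))
print(build_diff_command("abc123", "def456"))
```

An argument list like this can be passed to subprocess.run(..., capture_output=True, text=True) without shell=True, so the SHAs are never interpreted by a shell.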

Let's call find_file_diff.py from our .gitlab-ci.yml

image: python
stages:
  - print file diff

printing:
  stage: print file diff
  script:
    - echo "running python"
    - python3 find_file_diff.py

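Expected output in the pipeline after I changed a few files. Note that this log comes from a run in which the script was named find_helplink.py and the job also ran a few extra git diff-tree commands, so those lines differ from the configuration above.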

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-3.shared.runners-manager.gitlab.com/default zxwgkjAP, system ID: s_284de3abf026
Preparing the "docker+machine" executor
Using Docker executor with image python ...
Pulling docker image python ...
Using docker image sha256:f92346e0c39e6d8ba8c28e9528cc3e6e19df19be2fd733de4d38d6f899648ba5 for python with digest python@sha256:e7e3b031dbf71514d0a8d759d8417b04f8dcf483aec18d69abd1a2b3955297b6 ...
Preparing environment
Running on runner-zxwgkjap-project-43111362-concurrent-0 via runner-zxwgkjap-shared-1676477717-17705899...
Getting source from Git repository
$ eval "$CI_PRE_CLONE_SCRIPT"
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/cicd53/help-link/.git/
Created fresh repository.
Checking out 8dd538a7 as detached HEAD (ref is main)...
Skipping Git submodules setup
Executing "step_script" stage of the job script
Using docker image sha256:f92346e0c39e6d8ba8c28e9528cc3e6e19df19be2fd733de4d38d6f899648ba5 for python with digest python@sha256:e7e3b031dbf71514d0a8d759d8417b04f8dcf483aec18d69abd1a2b3955297b6 ...
$ echo "running python"
running python
$ git diff-tree --no-commit-id --name-only -r $CI_COMMIT_SHA
$ files=$(git diff-tree --no-commit-id --name-only -r $CI_COMMIT_SHA)
$ echo $files
$ python3 find_helplink.py
7300ba0c955d1caf27a0232a44db76d42
CI_COMMIT_BRANCH is main
git diff --name-only 7300ba0c955d1caf27a0232a44db76d42 1d244f8ba17ea0e9bdda2452faa4d1c
******************************************

cool_wrapper.py
not_so_cool_wrapper.json
sub-folder/hell-world.html
noice.yaml
['cool_wrapper.py', 'not_so_cool_wrapper.json', 'sub-folder/hell-world.html', 'noice.yaml', '']
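
The empty string at the end of that list comes from the final newline in git's output. If you need a clean list, for example to loop over the paths, one way to drop it is sketched below. The function name parse_changed_files is made up for illustration, and the sample string just mirrors the file names from the log above.

```python
def parse_changed_files(git_output):
    """Split git's --name-only output into a list of non-empty paths."""
    return [line for line in git_output.splitlines() if line.strip()]


# Sample input shaped like the git output in the pipeline log.
sample = (
    "cool_wrapper.py\n"
    "not_so_cool_wrapper.json\n"
    "sub-folder/hell-world.html\n"
    "noice.yaml\n"
)
print(parse_changed_files(sample))
# ['cool_wrapper.py', 'not_so_cool_wrapper.json', 'sub-folder/hell-world.html', 'noice.yaml']
```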

Hope this helps someone!