Containers
Tracking file changes in workspace

This document describes how to extract modified files from a workspace using the Sphere Engine Containers module. It first explains the motivation and possible applications, then provides a technical reference for the changes_tracker utility, and closes with a practical walkthrough based on a sample project.

Motivation and context

Suppose an end-user has executed a scenario within one of your projects. Usually, not all of the executed code originates from the end-user; only specific, expected parts are intended for further processing. These parts may require closer examination, for example for plagiarism checks or for display within your service.

Sphere Engine Containers provides a mechanism for extracting files from the workspace. Because deciding which files to retrieve is not trivial, the mechanism supports ignore lists, listing of changed files, and copying of changed files.

Note: End-user files often contain numerous elements that are unnecessary for download and analysis, such as third-party dependencies, cache files, and framework components. Transmitting and storing these would be highly inefficient and needlessly expensive.

Changes tracker

The Sphere Engine Containers module includes an internal utility named changes_tracker. In brief, this tool streamlines the process by:

  • generating a single state file that consolidates the MD5 hashes of all files within the specified directory,
  • interpreting a designated .trackignore file, which lets users specify files to exclude from tracking, such as packages or cache files,
  • comparing the current state of the project files with the information stored in the state file.
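
The three responsibilities above can be modeled with standard tools. The following sketch is a conceptual illustration only, not the changes_tracker implementation. It assumes that md5sum, find, grep, and sed are available, and the function names snapshot and changed are hypothetical.

```shell
#!/bin/bash
# Conceptual model only; this is NOT how changes_tracker is implemented.
# snapshot and changed are hypothetical helper names.

# snapshot DIR IGNORE_FILE
# Print one "<md5>  <path>" line per file under DIR, skipping every path
# that equals, or lies below, an entry listed in IGNORE_FILE.
snapshot() {
  local dir="$1" ignore_file="$2" path entry skip
  find "$dir" -type f | sort | while IFS= read -r path; do
    path="${path#./}"
    skip=0
    if [ -f "$ignore_file" ]; then
      while IFS= read -r entry || [ -n "$entry" ]; do
        [ -n "$entry" ] || continue
        case "$path" in
          "$entry" | "$entry"/*) skip=1 ;;
        esac
      done < "$ignore_file"
    fi
    if [ "$skip" -eq 0 ]; then
      md5sum "$path"
    fi
  done
}

# changed STATE_FILE DIR IGNORE_FILE
# Print the paths whose current "<md5>  <path>" line is absent from
# STATE_FILE, i.e. files modified or created since the snapshot.
# Deleted files are not reported by this simplified model.
changed() {
  local state_file="$1" dir="$2" ignore_file="$3"
  snapshot "$dir" "$ignore_file" |
    { grep -Fxv -f "$state_file" || true; } |
    sed 's/^[0-9a-f]\{32\}  //'
}
```

In this model, `snapshot project ignore.txt > state.txt` plays the role of creating the state file, and `changed state.txt project ignore.txt` plays the role of the comparison step. Both functions expect to be run from the directory that the paths are relative to.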

Technical reference

The following is a shortened manual of the changes_tracker utility.

NAME
  changes_tracker - utility for managing the file tracking feature

SYNOPSIS
  changes_tracker [OPTIONS] <COMMAND> [COMMAND_OPTIONS]

  <COMMAND>=track|list-changes|copy

DESCRIPTION
  This Sphere Engine Containers command provides tools for managing the
  process of tracking changes in workspace files.

OPTIONS
  -h, --help
    Print usage statement for command

COMMAND_OPTIONS
  -d <directory>, --dir <directory>
    Directory to track.

  -s <state_file>, --state-file <state_file>
    Path to the state file.

  -i <track_ignore_file>, --track-ignore <track_ignore_file>
    Path to the .trackignore file.

  -j, --json
    Produce output in JSON format.

  -t <output_directory>, --to <output_directory>
    The directory to which the changed files will be copied.

COMMANDS
  The changes_tracker utility is divided into independent commands. Each
  command serves a specific purpose.

  An overview of each command follows.

    track [COMMAND_OPTIONS]
      Configure the file tracking process.

      COMMAND OPTIONS
        -d <directory>, --dir <directory>
          Defines a directory to track.

        -s <state_file>, --state-file <state_file>
          Defines a state file as a reference for comparing files.

        -i <track_ignore_file>, --track-ignore <track_ignore_file>
          Indicates a .trackignore file with the list of files
          that should be ignored by the changes tracker.

    list-changes [COMMAND_OPTIONS]
      Print the list of changed files.

      COMMAND OPTIONS
        -s <state_file>, --state-file <state_file>
          Defines a state file as a reference for comparing files.

        -j, --json
          Produce output in JSON format.

    copy [COMMAND_OPTIONS]
      Copy changed files to the specified directory.

      COMMAND OPTIONS
        -s <state_file>, --state-file <state_file>
          Defines a state file as a reference for comparing files.

        -t <output_directory>, --to <output_directory>
          Defines the destination path for copying files.

EXAMPLES
  • Configure changes tracker to observe the ./project directory, store
    state file in the /sphere-engine/runtime_data/state.json file and
    use the .sphere-engine/.trackignore for a list of files to ignore.

    $ changes_tracker track --dir ./project \
      --state-file /sphere-engine/runtime_data/state.json \
      --track-ignore .sphere-engine/.trackignore

  • Print the list of changed files to the standard output using the
    /sphere-engine/runtime_data/state.json file as a reference.

    $ changes_tracker list-changes \
      --state-file /sphere-engine/runtime_data/state.json

  • Copy all modified files to the ./changed_files directory using the
    /sphere-engine/runtime_data/state.json file as a reference.

    $ changes_tracker copy \
      --state-file /sphere-engine/runtime_data/state.json \
      --to ./changed_files
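
The --json option of list-changes is not covered by the examples above. A minimal sketch of how it could be used from a script is shown below. The helper name save_changes_json is hypothetical, and since the layout of the JSON output is not described in this manual, the sketch only stores the output without parsing it.

```shell
#!/bin/bash
# save_changes_json is a hypothetical helper name; the command and options
# are the ones listed in the manual above.
# Usage: save_changes_json STATE_FILE OUTPUT_FILE
save_changes_json() {
  local state_file="$1" output_file="$2"
  # The layout of the JSON document is not described in this manual,
  # so it is written to OUTPUT_FILE without being parsed.
  changes_tracker list-changes --state-file "$state_file" --json > "$output_file"
}
```

A call such as `save_changes_json /sphere-engine/runtime_data/state.json ./changes.json` would leave the listing in ./changes.json for later processing.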

Example

Now, let's examine a sample project that demonstrates the changed-file tracking feature. To follow the next steps, create a new project using this sample project as a template:

  • start with Projects > Create project in the client panel,
  • type any Project name, select the Changes tracker (demo) template, and continue with the Next button,
  • enter the project in the role of Content Manager by pressing the Open IDE button.

Project configuration

Now, we are ready to examine the project's structure.

File change tracking should be configured before the workspace starts. The workspace_init script, available in the project settings, is the most suitable place for this. To view the configuration script, go to Options > Workspace init:

#!/bin/bash

changes_tracker track \
  --dir ./project \
  --state-file /sphere-engine/runtime_data/state.json \
  --track-ignore .sphere-engine/.trackignore

In this case:

  • the ./project directory is monitored,
  • information about the initial state of the monitored files is stored in the /sphere-engine/runtime_data/state.json file,
  • files listed in .sphere-engine/.trackignore are not monitored.

Note: Remember that the workspace init script runs only when the workspace first starts. If you change it, you will have to restart the workspace. Remember to save the project beforehand!

Note: Relative file paths are resolved against the workspace directory.
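
Because the init script runs only once, a misconfigured path may go unnoticed until the first scenario is launched. The following sketch shows one way to fail early. The function name init_tracking and its checks are our own additions, not part of the template; only the changes_tracker track call comes from the script above.

```shell
#!/bin/bash
# Sketch of a stricter workspace init; init_tracking is a hypothetical name.
# Relative paths are resolved against the workspace directory.
# Usage: init_tracking DIR STATE_FILE TRACK_IGNORE_FILE
init_tracking() {
  local dir="$1" state_file="$2" ignore_file="$3"
  if [ ! -d "$dir" ]; then
    echo "init_tracking: directory not found: $dir" >&2
    return 1
  fi
  if [ ! -f "$ignore_file" ]; then
    echo "init_tracking: ignore file not found: $ignore_file" >&2
    return 1
  fi
  # Same call as in the template's init script, with quoted arguments.
  changes_tracker track \
    --dir "$dir" \
    --state-file "$state_file" \
    --track-ignore "$ignore_file"
}

# Intended call inside the workspace init script:
# init_tracking ./project /sphere-engine/runtime_data/state.json .sphere-engine/.trackignore
```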

Now, let's check the content of the .sphere-engine/.trackignore file:

project/node_modules
project/yarn.lock
project/ignored.js

Note: The structure of the .trackignore file resembles the well-known .gitignore mechanism; keep in mind, however, that wildcard patterns are not supported.
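
Since wildcards are not supported, patterns have to be expanded into explicit paths before they are written to the file. The sketch below shows one possible approach using find. The helper name expand_ignore_patterns is hypothetical, and the sketch assumes that .trackignore accepts one literal path per line, as in the sample above.

```shell
#!/bin/bash
# expand_ignore_patterns is a hypothetical helper, not part of changes_tracker.
# Usage: expand_ignore_patterns DIR NAME_PATTERN...
# Prints every path under DIR whose base name matches one of the patterns,
# one path per line, without a leading "./", sorted and de-duplicated.
expand_ignore_patterns() {
  local dir="$1" pattern
  shift
  for pattern in "$@"; do
    find "$dir" -name "$pattern"
  done | sed 's|^\./||' | sort -u
}

# Example (run from the workspace directory):
# expand_ignore_patterns project '*.log' >> .sphere-engine/.trackignore
```

Files created after the expansion are not covered, so the helper would need to run again before the tracker is configured.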

List altered files

Now, we will use a predefined scenario to see the output of the list-changes command of the changes_tracker utility. The scenario is assigned to the List changed files button. First, let's briefly examine the script of this scenario, located in the .sphere-engine/list_changed_files.sh file:

#!/bin/bash

echo "The following files have changed:"
changes_tracker list-changes --state-file /sphere-engine/runtime_data/state.json

The only required parameter is --state-file, which provides the path to the state file that stores the initial state of the monitored files.

After launching the scenario (by pressing the List changed files button), you will see the output in the integrated terminal. Unsurprisingly, the list of modified files is empty, because we haven't modified any files yet. So, let's modify the project/app.js file in any way and save the changes. After launching the scenario once again, the output contains a single entry:

The following files have changed:

/home/user/workspace/project/app.js

You can also go back to the project/app.js file and revert the changes. After launching the scenario again, the list of files will be empty once again.
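
In the scenario as written, an empty list is shown as a header with nothing below it. A variant that reports this case explicitly could look like the sketch below. The function name print_changes is hypothetical, and the sketch assumes that list-changes prints one path per line, as in the sample output above.

```shell
#!/bin/bash
# Variant of the scenario script; print_changes is a hypothetical name.
# Assumes list-changes prints one path per line, as in the sample output.
# Usage: print_changes STATE_FILE
print_changes() {
  local state_file="$1" changes
  changes="$(changes_tracker list-changes --state-file "$state_file")" || return 1
  # Drop blank lines so that an empty listing is detected reliably.
  changes="$(printf '%s\n' "$changes" | sed '/^[[:space:]]*$/d')"
  if [ -z "$changes" ]; then
    echo "No files have changed."
  else
    echo "The following files have changed:"
    printf '%s\n' "$changes"
  fi
}

# Intended call inside the scenario script:
# print_changes /sphere-engine/runtime_data/state.json
```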

Note: Changes are always assessed against the original content. A file that was modified and then reverted to its original version is treated as unchanged.
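
This behavior follows from comparing content hashes rather than modification times. The short illustration below uses md5sum directly and does not invoke changes_tracker. The helper name content_hash and the temporary file are made up for the demonstration.

```shell
#!/bin/bash
# Illustration with md5sum only; changes_tracker itself is not invoked.
# content_hash is a hypothetical helper name.
content_hash() {
  md5sum < "$1" | cut -d ' ' -f 1
}

demo_dir="$(mktemp -d)"
printf 'console.log("v1");\n' > "$demo_dir/app.js"
original="$(content_hash "$demo_dir/app.js")"

printf 'console.log("v2");\n' > "$demo_dir/app.js"   # edit the file
edited="$(content_hash "$demo_dir/app.js")"

printf 'console.log("v1");\n' > "$demo_dir/app.js"   # revert the edit
reverted="$(content_hash "$demo_dir/app.js")"

[ "$original" != "$edited" ] && echo "edit detected"
[ "$original" = "$reverted" ] && echo "revert matches the original"
rm -rf "$demo_dir"
```

The file is written three times, yet the first and last hashes are identical, which is why a reverted file does not appear on the list.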

Copy altered files

The sample project includes two more scenarios: Prepare changed files for export via webhooks and Send changed files to your server. The latter showcases the webhooks mechanism, which we can skip for now. The former demonstrates the last functionality of the changes_tracker utility not yet covered: copying altered files.

First, let's look at the content of the scenario script .sphere-engine/export_via_webhooks.sh. Here is the essential part:

#!/bin/bash
(...)
changes_tracker copy --state-file $SE_PATH_RUNTIME_DATA/state.json --to ./changed_files

The copy command of the changes_tracker tool evaluates the list of changed files, once again using the state file /sphere-engine/runtime_data/state.json as a reference. All altered files are copied to the indicated directory, which in this case is ./changed_files.

Make sure that at least one monitored file in the ./project directory has actually been modified (e.g., ./project/app.js) and launch the scenario by pressing the Prepare changed files for export via webhooks button. After that, refresh the project's directory tree on the left-hand side and inspect the content of the ./changed_files directory. It contains all the altered monitored files. You can expect the following:

.
└── changed_files
    └── app.js
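
When the scenario is launched repeatedly, files copied by an earlier run may still be present in the destination. Whether copy removes them is not stated in this document, so the sketch below assumes it does not and recreates the destination before each export. The function name export_changes is hypothetical, and the elided part of the original script is not reproduced here.

```shell
#!/bin/bash
# export_changes is a hypothetical helper name.
# Assumption: copy does not remove files left in the destination by earlier
# runs, so the destination is recreated before each export.
# Usage: export_changes STATE_FILE DESTINATION_DIR
export_changes() {
  local state_file="$1" dest="$2"
  # Refuse obviously dangerous destinations before deleting anything.
  case "$dest" in
    "" | "/" | "." | "..")
      echo "export_changes: unsafe destination: '$dest'" >&2
      return 1
      ;;
  esac
  rm -rf -- "$dest" && mkdir -p -- "$dest" || return 1
  changes_tracker copy --state-file "$state_file" --to "$dest"
}

# Intended call inside the scenario script:
# export_changes /sphere-engine/runtime_data/state.json ./changed_files
```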

Finally, let's explore the outcome of modifying one of the files listed in .sphere-engine/.trackignore, such as ./project/ignored.js. After making and saving changes to this file, run both the List changed files and the Prepare changed files for export via webhooks scenarios to observe that this file appears neither in the list of changed files nor in the ./changed_files directory.