Difference between revisions of "ReactOS Git For Dummies"

From ReactOS Wiki

Revision as of 16:20, 5 October 2017

Introduction

Basic Git concepts

Lots of information here, take it slowly!

  • Commit: A commit in git contains information about a set of changes to the code. It represents the same information you could see in a patch/diff file, but in an internal binary representation. Notably, a commit has some values we care about:
    • Hash: A SHA1 (for now) hash computed from all the data and metadata included in the commit.
    • Message: The first line, or 80 characters of the first line, are considered the summary, and the rest is usually shown separately.
    • Author: The name+email of the person that wrote the patch.
    • Committer: The name+email of the person that applied the patch.
    • Parent(s): In a standard commit, there will be one parent, represented as its commit hash. In a merge commit (see below), it would have two parents.
  • Branching: A git branch is merely an entry in the repository that has a HEAD property, pointing to the commit hash of the newest commit in the branch. The commit operation implicitly assigns the new hash to the branch’s head. This gives git an extremely fast branching system, which makes branches a very effective tool for managing contributions.
  • Merging: When two branches are merged, a new commit is created, which contains all the data necessary to apply the changes from one branch onto the other. This is represented in the metadata by a commit with two parents. Parent 1 represents the primary source of code, and parent 2 the “other” branch that is being merged. Git does not distinguish between merging from master into a work branch and merging from a work branch into master: in both cases a merge is “A=A+B”, with A being the currently checked out branch and B the branch chosen to merge from. The downside of merge commits is that they can “dirty” the history when they pile up one after another.
  • Committing: The commit operation encodes the modified files, along with a commit message and authorship information, into the internal representation described above. The new commit’s hash is then assigned to the branch head in the local repository.
  • Clone: In git, everyone works based on a full copy of the history. The action of creating this copy from a remote repository into your computer, is called cloning.
  • Remotes: A git remote tells git about a repository on another computer (for most of us, this will be GitHub’s servers), where you will be fetching from and/or pushing to. You can have as many remotes as you need, with any names you want, but the most common names are “origin” for your personal fork, and “upstream” for the repository that you are contributing to.
  • Fetch: In order to retrieve new commits from a remote, the fetch operation computes what data you are missing, and sends back all the compressed objects and metadata you need in order to replicate the rest of the history. This data is accessible from special branches, with names such as “remotes/origin/master”, where “origin” is the name of the remote you assigned when setting up the repository, and “master” is the name of the branch in the remote repository.
  • Checkout: Checking out is the operation of extracting the data at the state represented by the given branch/commit, and making it available in the working copy. Unlike SVN, git does not like performing a checkout with modified files, and stashing is often required.
  • Stashing: Git has the ability to save local modifications as patches in an internal “stash”. It is a common thing to stash changes before you update the working copy, and “pop” from the stash when you are done, to return to the previous state or apply the changes to a different branch.
  • Pull: You will inevitably come across tutorials and examples using the pull command. This is a combined utility command that performs a fetch operation of the selected branch, and immediately after, a merge operation from the remote branch, to the selected branch. We don’t recommend using pull, but rather to use fetch and rebase.
  • Push: The reverse operation to pull. It computes the required data and sends it to the remote repository, subsequently updating the remote’s HEAD to match your local HEAD commit hash.
  • Fast-forwarding: A push operation is considered to be fast-forwarding, if it adds commits following the current HEAD, and non-fast-forward if the history does not fully contain all the commits from the target.
  • Rebase: In the normal workflow of git, updating the state of a branch implies merging from another branch. There are cases, however, when we do not want to use a merge commit, but rather we want to re-apply all the changes on top of the new updated code. The rebase operation does exactly that: it computes the list of commits that have been added since the branches split off, and applies all of them on top of the target commit. Because the metadata will change, a rebase operation is considered to “rewrite history”. Because of that, rebasing is normally considered an advanced operation, and can have side-effects that may be harder to grasp for a beginner. However, we’ll be using it extensively, so it is important to understand the idea.
  • Force-pushing: Git normally rejects pushes that are non-fast-forward. If we amend, rebase, or otherwise modify the data or metadata of commits that have previously been pushed to a different branch or repository, we are implicitly generating an “alternate history” of those changes. Git will then refuse to accept the push, because it could lose data and is therefore considered dangerous. Git provides a “--force” flag, which disables all of those checks. Because “--force” is extremely dangerous, its use is highly discouraged. However, in order to use rebase effectively, we do need to force-push when working with our own work branches (force-pushing to the master branch of our main repository is strictly forbidden, and reserved only for administrative actions). To lessen the danger of the force-push operation, there is an alternative flag, “--force-with-lease”, which allows the force-push only if the remote state matches the local cache.
  • Pull Requests: When contributing as an external contributor, or when working with large change sets, spanning many commits, or with sensitive edits, we want to be able to review the changes before they are applied. In GitHub, as with most other repository hosts, a pull request offers this service. A pull request gives other developers a chance to comment, request changes, and eventually approve or reject those changes. Once the changes have been reviewed, and sufficiently approved (the number of approvals is at the discretion of the pull request’s author -- or assignee if the author is not a team member), the Merge button allows integrating those changes into the master branch.
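To make the commit fields listed above more tangible, here is a small sketch that builds a throwaway repository and prints the hash, author, committer and parents of a merge commit. The names, e-mail addresses, file names and the [DEMO] messages are made up for illustration; only the git commands themselves are real.

```shell
set -eu
# Hypothetical identities, so the script does not depend on your git config.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Committer" GIT_COMMITTER_EMAIL="committer@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # use "master" whatever the default branch name is

echo one > file.txt
git add file.txt
git commit -q -m "[DEMO] First commit"

# Let a work branch and master diverge by one commit each.
git checkout -q -b work
echo two > other.txt
git add other.txt
git commit -q -m "[DEMO] Work on a branch"

git checkout -q master
echo three > third.txt
git add third.txt
git commit -q -m "[DEMO] Meanwhile on master"

# A = A + B: merge "work" into the currently checked out branch (master).
git merge -q --no-ff -m "[DEMO] Merge work into master" work

# Show the values we care about for the newest commit.
git log -1 --format='hash:      %H%nauthor:    %an <%ae>%ncommitter: %cn <%ce>%nparents:   %P%nsummary:   %s'

parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
author=$(git log -1 --format=%an)
committer=$(git log -1 --format=%cn)
```

The merge commit lists two parent hashes, while every other commit in this repository has at most one.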

FAQ

  • You spoke about parents, but what about children, which is the “next commit” for a given hash?
    • Git does not store any information about children in the commit graph. To obtain the list of children (branching implies there can be many), a slow walk through all the branch histories is needed. The main method to analyze the history is to open the commit log on your favorite Git GUI tool, and locate your commit in there. Because this can sometimes be annoying, we have an alternative way through the GetBuilds site, which allows navigating back and forth through the builds, giving something that approximates the linear history of SVN.
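If you prefer the command line, the walk through the branch histories can be sketched with git rev-list --children, which prints every commit followed by the hashes of its children. The repository, branch name and [DEMO] messages below are made up for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo base > file.txt
git add file.txt
git commit -q -m "[DEMO] Base"
base=$(git rev-parse HEAD)

# Two branches continue from the same commit, so it gets two children.
git checkout -q -b feature-a
echo a > a.txt
git add a.txt
git commit -q -m "[DEMO] Child A"

git checkout -q master
echo b > b.txt
git add b.txt
git commit -q -m "[DEMO] Child B"

# Each output line is "<commit> <child> <child> ..."; keep the line for our commit.
children=$(git rev-list --children --all | grep "^$base" | cut -d' ' -f2-)
child_count=$(echo "$children" | wc -w | tr -d ' ')

echo "Children of $base:"
echo "$children" | tr ' ' '\n'
```

On a repository the size of ReactOS this has to visit the whole history, which is exactly the slow walk described above.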


Workflow

Basic rules that we should try to follow

  • Prefer working on a branch, and pushing to your personal fork.
    • The exception is small commits that you consider safe to push directly to master, provided you are part of the team.
  • Avoid merge commits
    • This implies using rebase to ensure that all history remains linear
  • Keep a summary of the commit on the first line
    • Example:
      [NTOSKRNL]: Update exports to NT6.3
      * Added some stuff
      * Removed some stuff
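As a rough sketch of how git sees such a message: everything up to the first blank line is treated as the summary, so this example passes the summary and the details as two separate -m options, which git joins with a blank line in between. The throwaway repository and the empty commit are only there to keep the example self-contained.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .

# First -m is the summary line, second -m becomes the body paragraph.
git commit -q --allow-empty \
    -m "[NTOSKRNL]: Update exports to NT6.3" \
    -m "* Added some stuff
* Removed some stuff"

summary=$(git log -1 --format=%s)   # summary only
body=$(git log -1 --format=%b)      # everything after the blank line

git log -1 --oneline                # short views like this show only the summary
```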


Cloning the repository

These are the recommended steps to set up a local clone:

  1. Go to https://github.com/reactos/reactos and click the Fork button there to create a personal fork of the ReactOS repository.
    This one will host your work branches and will be the source of your Pull Requests. It will be available publicly in a URL like https://github.com/<yourusername>/reactos.
    • I will be calling these personal forks to distinguish them from project-wide forks such as OpenOffice vs LibreOffice. A personal fork is just a personal space to make changes in, with the intention of contributing them back upstream.
    • Creating forks on other sites is possible, but you won't have the pull request feature, so this is not recommended.
  2. Clone https://github.com/<yourusername>/reactos into your computer.
    • Using TortoiseGit:
      Right-click somewhere → Git Clone → Enter URL and Directory → OK
    • Using the command line:
      git clone https://github.com/<yourusername>/reactos
  3. Add https://github.com/reactos/reactos as a remote named upstream, so that you can fetch latest changes from the main ReactOS repository.
    • Using TortoiseGit:
      1. Right-click the clone folder → TortoiseGit → Settings → Remote
      2. Enter upstream for Remote: there and https://github.com/reactos/reactos for URL:
      3. Then click Add New/Save and answer any appearing questions with No
    • Using the command line:
      git remote add upstream https://github.com/reactos/reactos

Now you have a local clone with two remotes:

  • origin - Your personal fork, where you can work in branches
  • upstream - The main ReactOS repository
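The resulting layout can be tried out offline with the sketch below. It assumes local bare repositories as stand-ins for GitHub: upstream.git plays the main ReactOS repository and fork.git plays your personal fork. All directory names are invented for the example; with the real repositories you would use the GitHub URLs from the steps above.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

root=$(mktemp -d)
cd "$root"

# A seed repository with one commit, so the stand-ins are not empty.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "[DEMO] Initial commit"

git clone -q --bare seed upstream.git        # stand-in for the main repository
git clone -q --bare upstream.git fork.git    # stand-in for pressing the Fork button

# Step 2: clone the personal fork. It becomes the remote called "origin".
git clone -q fork.git reactos
cd reactos

# Step 3: add the main repository as a second remote called "upstream".
git remote add upstream "$root/upstream.git"

git remote -v
remotes=$(git remote | sort | xargs)
upstream_url=$(git config --get remote.upstream.url)
```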


Updating your environment with latest code

This is how you get the latest code set up before you start working on something new:

  1. Checkout your local master branch (if you were working on another branch)
    • Using TortoiseGit:
      Right-click the clone folder → TortoiseGit → Switch/Checkout → Select master as Branch: → OK
    • Using the command line:
      git checkout master
  2. Fetch the latest commits from the main ReactOS repository and apply them on top of your local master branch:
    • Using TortoiseGit:
      1. Right-click the clone folder → Git Sync
      2. Select upstream as Remote: there and master as Remote Branch:
      3. Click the Fetch & Rebase button (it's hidden behind the arrow next to the Pull button)
    • Using the command line:
      git fetch upstream
      git rebase upstream/master
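Here is the same update sequence as a self-contained sketch that can be run without network access. It assumes local bare repositories (upstream.git and fork.git, both invented names) in place of GitHub, and simulates somebody else landing a commit in the main repository before we update.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

root=$(mktemp -d)
cd "$root"

# Stand-ins for the main repository, the personal fork and the local clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "[DEMO] Initial commit"
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git
git clone -q fork.git reactos
git -C reactos remote add upstream "$root/upstream.git"

# Meanwhile, somebody else pushes a new commit to the main repository.
git -C seed commit -q --allow-empty -m "[DEMO] New upstream work"
git -C seed push -q "$root/upstream.git" master

# The actual update: checkout master, fetch, rebase.
cd reactos
git checkout -q master
git fetch -q upstream
git rebase -q upstream/master

local_head=$(git rev-parse master)
upstream_head=$(git rev-parse upstream/master)
seed_head=$(git -C "$root/seed" rev-parse master)
```

Because the local master had no commits of its own, the rebase simply moves it forward to the fetched commit.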


Performing quick fixes (a.k.a. Working directly into the master branch of the main repository)

These are the steps one would follow during normal development of a “quick fix” (without a work branch). It is recommended to use a work branch for anything that is more than a trivial change:

  1. Ensure your local master is up to date following the instructions above.
  2. Do the work. Commit early, commit often (locally).
    • You can later use TortoiseGit's combine into one commit, or commandline git's interactive rebase, to reduce the number of total commits you will push later, so don't worry about spamming the log!
  3. Commit the final touches.
  4. Fetch the latest code again following the instructions above, and use the rebase method to make sure your changes are applied on top of the latest master code.
    • THIS IS VERY IMPORTANT! ALWAYS REBASE BEFORE PUSHING TO MASTER!
  5. This is your last chance to use interactive rebase, or TortoiseGit's log window, to clean up your commits until you are satisfied with the resulting patch set.
  6. Push to upstream
    • Using TortoiseGit:
      Right-click the clone folder → TortoiseGit → Push → Select master as Remote: under Ref and upstream as Remote: under Destination → OK
    • Using the command line:
      git push upstream
  7. If the push operation complains about non-fast-forward commits, it means someone else pushed something in between your fetch and your push, so you will have to go back to the fetch+rebase steps and try again.
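The rejected push from the last step, and the way out of it, can be reproduced offline with this sketch. It assumes a local bare repository (upstream.git) as a stand-in for the main repository and two clones, "mine" and "theirs", for two developers; all of these names and the file contents are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

root=$(mktemp -d)
cd "$root"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "[DEMO] Initial commit"
git clone -q --bare seed upstream.git

# Two developers, both starting from the same master.
git clone -q -o upstream upstream.git mine
git clone -q -o upstream upstream.git theirs

# The other developer pushes first.
cd "$root/theirs"
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "[DEMO] Their quick fix"
git push -q upstream master

# Our push is now non-fast-forward and gets rejected.
cd "$root/mine"
echo mine > mine.txt
git add mine.txt
git commit -q -m "[DEMO] My quick fix"
if git push -q upstream master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Go back to the fetch+rebase steps and try again.
git fetch -q upstream
git rebase -q upstream/master
git push -q upstream master

merge_count=$(git rev-list --merges --count master)
local_head=$(git rev-parse master)
server_head=$(git --git-dir="$root/upstream.git" rev-parse master)
```

Note that the history stays linear: the retry needs neither a merge commit nor a force-push.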


It should go without saying, but NEVER EVER FORCE-PUSH TO MASTER. Gladly, this is disabled on the server.


Working on a branch

These are the steps one would follow to work on a personal branch, backed by your personal fork, on something that can't be described as a “quick fix”:

  1. Ensure your local master is up to date following the instructions above.
  2. Create the branch with a name representative of the work you will be doing
    • Using TortoiseGit:
      1. Right-click the clone folder → TortoiseGit → Create Branch → Enter a name → OK
      2. Right-click the clone folder → TortoiseGit → Switch/Checkout → Select your new branch → OK
    • Using the command line:
      git branch <branch-name>
      git checkout <branch-name>
  3. Do the work. Commit early, commit often (locally).
    • You can later use TortoiseGit's combine into one commit, or commandline git's interactive rebase, to reduce the number of total commits you will push later, so don't worry about spamming the log!
  4. Push to your personal fork (Remote: origin) whenever you want to back things up online. This is optional, but recommended!
    • The first time you push to your personal fork, you have to tell it explicitly about the new branch:
      • Using TortoiseGit:
        1. Right-click the clone folder → TortoiseGit → Push
        2. Leave Remote: in Ref blank and select origin as Remote: in Destination
        3. Check Set upstream/track remote branch
        4. Click OK
      • Using the command line:
        git push --set-upstream origin <branch-name>
    • Subsequent pushes will be easier:
      • Using TortoiseGit:
        Right-click the clone folder → TortoiseGit → Push → OK
      • Using the command line:
        git push
  5. Commit the final touches and push the result to your personal fork.
  6. Navigate to your personal fork on GitHub, and in your branch, choose Create pull request
    • If the push was done very recently, it will show up on the main page of your fork
  7. Take a final look, and maybe ask others to review the code and give their approval.
    • Getting things reviewed is highly recommended, especially for bigger changes.
  8. Once you are positive about your changes, press the Rebase and merge button that GitHub provides.


If the merge button isn't enabled, it probably means GitHub can't safely rebase your changes. In that situation, you first have to rebase manually as described in #Updating your environment with latest code. Then force-push into your personal fork:

  • Using TortoiseGit:
    Right-click the clone folder → TortoiseGit → Push → Check Force: May discard known changes → OK
  • Using command line:
    git push --force-with-lease
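To see why the force-push is needed after a manual rebase, here is an offline sketch of the whole branch workflow. It assumes local bare repositories (upstream.git for the main repository, fork.git for the personal fork) instead of GitHub, and a hypothetical branch called my-feature.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

root=$(mktemp -d)
cd "$root"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "[DEMO] Initial commit"
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git
git clone -q fork.git reactos
cd reactos
git remote add upstream "$root/upstream.git"

# Create the work branch, commit, and back it up in the personal fork.
git branch my-feature
git checkout -q my-feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "[DEMO] Add feature"
git push -q --set-upstream origin my-feature

# The main repository moves on while we work.
git -C "$root/seed" commit -q --allow-empty -m "[DEMO] New upstream work"
git -C "$root/seed" push -q "$root/upstream.git" master

# Rebase manually: this rewrites our commit, so its hash changes.
git fetch -q upstream
git rebase -q upstream/master

# A plain push is now non-fast-forward and gets rejected by the fork.
if git push -q 2>/dev/null; then plain_push=accepted; else plain_push=rejected; fi

# --force-with-lease overwrites the branch, but only if the fork still
# holds what we last saw there.
git push -q --force-with-lease

local_head=$(git rev-parse my-feature)
fork_head=$(git --git-dir="$root/fork.git" rev-parse my-feature)
if git merge-base --is-ancestor upstream/master my-feature; then on_top=yes; else on_top=no; fi
```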


Applying an old patch from SVN

We have plenty of patches in JIRA, and on our HDDs, that were created from SVN, or at least for SVN.

Although Git is perfectly capable of generating and applying patches, its primary patch format contains metadata that SVN patches lack, most notably the authorship information. To work around the differences in the patch format, alternative methods need to be used to get the patch applied.

Here are the steps needed to apply an old patch:

  1. Make sure your environment is updated, and an appropriate branch is in place, as described in the previous sections.
  2. Use the general-purpose patch program (instead of git apply) or a GUI patching utility to apply the changes.
    • Using TortoiseGit:
      Right-click the patch file → TortoiseGit → Review/apply single patch
    • Using the command line:
      patch -p1 < patchfile.patch
  3. Review the changes and write an appropriate message.
  4. Perform the commit with a custom author
    • Using TortoiseGit:
      Check the Set Author checkbox in the Commit dialog and enter the Name and E-Mail Address of the patch author.
    • Using the command line:
      git commit --author "Name <email@address.com>"
  5. Rebase and push as needed.
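The steps above can be sketched end to end as follows. The patch here is a hand-written, hypothetical SVN-style patch whose paths start with reactos/, which is the leading component that -p1 strips; the file, the patch author and the committer are all invented. It also assumes the patch program is installed.

```shell
set -eu
# The committer is whoever applies the patch (hypothetical identity).
export GIT_AUTHOR_NAME="Example Committer" GIT_AUTHOR_EMAIL="committer@example.com"
export GIT_COMMITTER_NAME="Example Committer" GIT_COMMITTER_EMAIL="committer@example.com"

root=$(mktemp -d)
mkdir "$root/reactos"
cd "$root/reactos"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo hello > hello.txt
git add hello.txt
git commit -q -m "[DEMO] Initial import"

# Write a minimal SVN-style patch. It carries no author information at all.
{
    printf 'Index: reactos/hello.txt\n'
    printf '===================================================================\n'
    printf -- '--- reactos/hello.txt\t(revision 1)\n'
    printf -- '+++ reactos/hello.txt\t(working copy)\n'
    printf '@@ -1 +1 @@\n'
    printf -- '-hello\n'
    printf -- '+hello world\n'
} > "$root/fix.patch"

# Apply with the general-purpose patch program instead of git apply.
patch -p1 < "$root/fix.patch"

# Commit with a custom author so the original contributor gets the credit.
git add hello.txt
git commit -q -m "[DEMO] Apply old patch" --author "Patch Author <patch.author@example.com>"

author=$(git log -1 --format='%an <%ae>')
committer=$(git log -1 --format=%cn)
content=$(cat hello.txt)
```

If your patch was created from inside the source tree, its paths may not have a leading directory to strip, and a different -p value may be needed.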

Slightly more advanced topics

Amending commits

If you notice a mistake in your last commit, Git allows you to amend it:

  1. Do the file changes that should have been included in the last commit.
  2. Amend the last commit:
    • Using TortoiseGit:
      Right-click the clone folder → Git Commit → Check Amend Last Commit
    • Using the command line:
      git commit --amend

You can also change author or date information when amending commits. TortoiseGit has additional checkboxes for this while command line Git accepts additional parameters for git commit.

Note that you change the history when amending commits. That means, if you have already pushed your last commit to your personal fork, you have to force-push the changes. That also means, amending a commit that has been pushed to the main ReactOS repository is not possible!
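The following sketch shows the history rewrite in action: after amending, there is still only one commit, but it has a different hash. The repository, file name and [DEMO] message are made up for the example.

```shell
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "[DEMO] Add notes"
before=$(git rev-parse HEAD)

# Do the file change that should have been part of the last commit...
echo "forgotten line" >> notes.txt
git add notes.txt

# ...and fold it into that commit, keeping the existing message.
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)
echo "before: $before"
echo "after:  $after"
```

The old hash is what your personal fork still has if you pushed before amending, which is why the next push has to be a force-push.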

Removing the need to enter your user+password on push (improves security)

GitHub (like most other Git hosts) offers a way to authenticate your machines using SSH, replacing the need for passwords. These SSH keys are usually generated once per machine, so that if one machine is stolen and/or compromised, you can revoke its public key and prevent further damage.

In order to use an SSH key, these are the steps:

  1. Get yourself an SSH key. This depends on your environment and platform. My choice is to configure TortoiseGit to use PuTTY, generate my keys with PuTTYgen, and load them with PuTTY's Pageant (authentication agent). Another option is to use OpenSSH tools instead.
    • You can find tutorials of your chosen tool, or ask someone on IRC for help.
  2. Add the public key to your GitHub account, in the account settings, under SSH Keys. Don't ever share the private key!
  3. Change your local clone settings, so that your remotes point to the URL with SSH protocol.
    You can see this URL by navigating to the repository, clicking the Clone or download button and choosing Use SSH.
  4. From this point on, pushing will not ask for your GitHub password, but it WILL ask for the SSH passphrase if your private key has one.
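On the command line, step 3 comes down to changing the URL of the remote. The sketch below does this in a throwaway repository and never contacts GitHub; "yourusername" is a placeholder for your own account name, and the commented ssh-keygen line is just one possible way to create a key with the OpenSSH tools.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# One option for step 1 with OpenSSH (not executed here):
#   ssh-keygen -t ed25519 -C "you@example.com"

# A clone made over HTTPS has a remote like this one.
git remote add origin https://github.com/yourusername/reactos

# Point the same remote at the SSH URL instead.
git remote set-url origin git@github.com:yourusername/reactos.git

origin_url=$(git config --get remote.origin.url)
git remote -v
```

If you also push to the main repository, the upstream remote can be switched in the same way.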