Comment: Updated hints for PR testing. Moved section

...

  1. Fork the DSpace GitHub Repo to store your local changes: As GitHub describes in their "Fork a Repo" guide, forking lets you create your own personal copy of the codebase. A fork not only gives you a place to put your local customizations, it also provides an easier way to contribute your work back to the DSpace community (via a GitHub Pull Request, as described in the "Contributing Changes to DSpace via GitHub" section below).
    • You can fork the repository directly from the GitHub User Interface. Just create an account at GitHub, then browse to the DSpace GitHub repository (https://github.com/DSpace/DSpace) and click the "Fork" button at the top of the page. This creates a full copy of that repository under your GitHub account (e.g. https://github.com/[your-username]/DSpace).
  2. Clone your GitHub Repo to your local machine. Now that you have a fork of the DSpace GitHub repository, you'll want to "clone" your repository to your local machine (so that you can commit to it, etc.). You can clone it to whatever directory you wish. You can click the clone button in the GitHub web interface, or you can use the following command line instruction. In the example below we call the directory "dspace-src". For this to work, you first need to set up an SSH key with GitHub:

    Code Block
    git clone git@github.com:[your-username]/DSpace.git dspace-src
    cd dspace-src

    You now have the full DSpace source code, and it's also in a locally cloned git repository!

  3. For easier Fetches/Merges, set up an "upstream" repository location. If you have forked the DSpace GitHub repository, then you may want to set up an "upstream" remote that points at the central DSpace GitHub repository. It simply gives you an easier-to-remember "name" for the central DSpace GitHub repository. This is described in more detail in the GitHub "Fork a Repo" guide. The second command will download all new references (branch names, tags, ...) from the upstream repo into your local repo.

    Code Block
    # GitHub no longer serves the unauthenticated git:// protocol, so use HTTPS (or SSH) for the remote URL
    git remote add upstream https://github.com/DSpace/DSpace.git
    git fetch upstream

    (Technically, you can name this remote something other than "upstream". However, "upstream" is the naming convention GitHub recommends. It avoids confusion with "origin", which refers to your personal fork: the repository you push your branches to before submitting pull requests to "upstream".)

  4. Create a branch for each new feature/bug you are working on. Because Git makes branching & merging easy (see Pro-Git's chapter on "Basic Branching & Merging"), you should create new branches frequently (even several times a day) and avoid working directly in the master branch (unless you are making a very minor change). In this case, we'll create a local branch named "DS-123" (note that this branch only exists on your local machine so far). We'll also perform a "checkout" in order to switch over to using this new branch.

    Code Block
    git branch DS-123
    git checkout DS-123
  5. Do your development work on your new branch, committing changes as you go. Note that at this point, you are only committing changes to your local machine. Nothing new will show up in GitHub until you push it there. This is a very basic example of a single file commit, but you get the idea.

    Code Block
    git commit NameOfFileToCommit.java
  6. Optionally, you can push these changes and this "feature" branch up to your GitHub account. If you want to share your work more publicly, you can push the changes and your new branch up to your personal GitHub repository:

    Code Block
    git push origin DS-123

    In this command "origin" is actually the name of the repository that you initially cloned (from your own personal GitHub account). This pushes your new branch up to GitHub, so that it is publicly available to other developers.

  7. Optionally, generate a Pull Request to DSpace GitHub. If this is code you feel should be added to the main DSpace GitHub, you should generate a Pull Request from your "feature branch" (DS-123) that you pushed in the previous step.  This will notify the DSpace Committers that you have a new feature or bug fix which you feel is worth adding to the main DSpace codebase. The Committers will then review this Pull Request and let you know if it can be accepted.
    1. For more details on generating a Pull Request for review, see the "Contributing Changes to DSpace via GitHub" section below.
    2. Always make sure to generate a Pull Request from a "feature branch" (e.g. branch "DS-123").  You should never generate a Pull Request from your "master" branch.  A Pull Request just points at a branch, so any new commits you add to that branch will be immediately reflected in your previously created Pull Request.
  8. Optionally, once you have finished your work, you may wish to merge your changes to your "master" branch. Your personal "master" branch is where all your completed code should eventually be merged ("master" is loosely equivalent to "trunk" in Subversion). So, once you are done with the branch development, you should merge that code back into your "master" branch. Luckily, Git makes this simple and will figure out the best way to merge the code for you. In rare situations you may encounter conflicts which Git will tell you to resolve. For more details, see Pro-Git's chapter on "Basic Branching & Merging". In order to perform the merge, you'll first need to switch over to the "master" branch (the branch you are merging into):

    Code Block
    git checkout master
    git merge DS-123

    There! You've now merged the changes you made on the "DS-123" branch into your personal "master" branch!

  9. Optionally, push this merge up to your GitHub account. Again, at any time, you can push your local changes up to your GitHub account for public sharing. So, if you want to push your newly merged "master" branch, you'd do the following:

    Code Block
    git push origin master

    (I.e., you are pushing your local "master" branch up to the "origin" repository at GitHub. Remember, "origin" refers to the repository you initially cloned, which in this example would be your personal GitHub repo that you cloned in Step #2 above.)

  10. Once your branch is no longer needed, you can delete it. Really, there's no need to keep around all these small branches! If you generated a Pull Request from this branch, you will need to keep it around until the Pull Request is either merged or closed. But once you no longer have any other use for the branch you created, just delete it! Here's an example of deleting the "DS-123" branch from both your local machine and from your public GitHub account (if you shared it there):

    Code Block
    # Remove the branch locally first
    git branch -d DS-123
    # If you have pushed it to GitHub, you can also remove it there by doing a new push (notice the ":")
    git push origin :DS-123
  11. Fetch changes from central DSpace GitHub. New changes/updates/bug fixes happen all the time, so you want to be able to keep your "fork" up-to-date with the central DSpace GitHub. In this case, you can now take advantage of the "upstream" remote that you set up back in Step #3 above. If you recall, in that step, you configured "upstream" to point to the central DSpace GitHub repo. So, if there are changes made to the central DSpace GitHub, you can fetch them down to your local machine as follows:

    Code Block
    # Fetch the changes from the repo you named "upstream"
    git fetch upstream
    

    What this command has done is update the "upstream/master" remote-tracking branch (on your local machine) with the latest changes from that "upstream" repository. Nothing has been merged into your own branches yet.

  12. Merge changes into your Local repository. Remember, "fetching" changes just brings a copy of those changes down to your local machine. You'll then need to merge those changes into your "master" branch, and optionally push the changes back to your personal public GitHub repository.

    Code Block
    # First, make sure we are on "master" branch
    git checkout master
    # Now, merge the changes in the "upstream/master" branch into my "master" branch
    git merge upstream/master

    In this case, Git will attempt to merge any new changes made in the "upstream" repository into your local "master" branch.

  13. Push those merged changes back up to GitHub. Once you are up-to-date, you may now want to push your latest merge back up to your public GitHub repository.

    Code Block
    git push origin master
    # "--all" pushes all of your local branches. New tags pulled down by the 'fetch' above are pushed separately with "--tags".
    git push origin --all
    git push origin --tags
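
Steps 2 through 10 above can be rehearsed without touching GitHub. The following is a minimal sketch, assuming two local bare repositories stand in for the central DSpace repository and for your fork. The directory names (upstream.git, fork.git, seed), the file Fix.java and the identity "Example Dev" are all hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: local bare repos stand in for GitHub (all names are hypothetical).
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# "upstream.git" plays the central DSpace repo; seed it with one commit.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$sandbox/upstream.git" master
cd "$sandbox"

# Step 1 stand-in: "fork" the central repo.
git clone -q --bare upstream.git fork.git

# Steps 2-3: clone the fork, then add and fetch the "upstream" remote.
git clone -q fork.git dspace-src
cd dspace-src
git config user.name "Example Dev"
git config user.email "dev@example.org"
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream

# Steps 4-6: create a feature branch, commit on it, push it to the fork.
git checkout -q -b DS-123
echo "// example change" > Fix.java
git add Fix.java
git commit -q -m "DS-123: example fix"
git push -q origin DS-123

# Steps 8-9: merge the feature branch into master and push master.
git checkout -q master
git merge -q DS-123
git push -q origin master

# Step 10: delete the branch locally and on the fork.
git branch -q -d DS-123
git push -q origin :DS-123

# Show which local and remote-tracking branches remain.
git branch -a
```

Note that "git checkout -b DS-123" creates the branch and switches to it in one step, which is equivalent to the separate "git branch" and "git checkout" commands in Step 4.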

Additional Handy Git Commands

  • git status - At any time, you may use this command to determine the status of your local git repository and how many commits ahead or behind it may be from the "origin" repository at GitHub. It also tells you if you have local changes that you haven't yet committed. For more info type: git help status
  • git log - At any time, you may use this command to see a log of recent commits you've made to the current branch. For more info type: git help log
  • git diff - At any time, you may use this command to see the differences in your current in-progress (uncommitted) work. For more info type: git help diff
  • git pull - Can be used instead of using git fetch followed by git merge. A "pull" does both a fetch and an automated "merge" in a single step.
  • git stash (See also: Stashing) - Allows you to temporarily "stash" uncommitted changes. This command is extremely useful if you want to do a "pull" or "merge" but were working on something else. You can temporarily "stash" what you are working on to perform the merge/pull, and then use git stash apply to reapply your stashed work.
  • git rebase (See also: Rebasing) - This tool is extremely powerful, and can be used to reorganize or combine commits that have been made on a local branch. It can also be used in place of a "merge" (in any of the situations described above). However, as it changes your commit history, you should NEVER USE REBASE ON ANY BRANCH THAT HAS BEEN PUBLICLY SHARED ON GITHUB. For more information, see Pro-Git's chapter on Rebasing and GitHub's 'rebase' page.
  • git cherry-pick - useful to grab a single commit into current branch from any of the local branches or remote branches added locally. A handy use-case is when we have a master branch and a bugfix branch (e.g. dspace-5_x) and you want a bugfix to go to both branches, you can commit to either branch, then switch to the other and execute "git cherry-pick [commit-hash]". Don't forget to push both branches, as usual. For details, see Getting a commit to multiple branches (backporting)
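
The cherry-pick use case described in the last item can be tried out in a throwaway repository. This is a minimal sketch, assuming a hypothetical local repo with a "master" branch and a "dspace-5_x" maintenance branch; the file names, commit messages and the identity "Example Dev" are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: backport a single bugfix commit with cherry-pick (names are hypothetical).
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"

# A shared base commit; the maintenance branch starts here.
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git branch dspace-5_x

# master moves on with a feature that must NOT be backported...
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Feature only for master"

# ...and with a bugfix that should land on both branches.
echo "bugfix" > fix.txt
git add fix.txt
git commit -q -m "DS-123: bugfix wanted on both branches"
fix_hash=$(git rev-parse HEAD)

# Switch to the maintenance branch and copy over just that one commit.
git checkout -q dspace-5_x
git cherry-pick "$fix_hash"

# The maintenance branch now has the bugfix but not the feature.
git log --oneline
```

The cherry-picked commit gets a new hash on "dspace-5_x", since it has a different parent than the original commit on "master".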
GitHub tips

If you want to show someone the diff between two commits, you can do it directly using GitHub functionality. Example:

https://github.com/DSpace/DSpace/compare/fde129026febcd58af030e14c7a7f82bd201033b...dspace-3.0

As you can see, each side of the comparison can be specified either as a commit hash or as a tag (they're interchangeable). Bonus tip: the two commits don't even have to be in the same repository, so you can compare e.g. your fork to the official repo.

Contributing Changes to DSpace via GitHub

While we're still working out the ideal workflow for contributions, existing Committers will have direct push access to the DSpace GitHub repo, while contributors are encouraged to submit a Pull Request for review.

Creating & Updating Pull Requests

  1. Creating the PR: Please, make sure to create a Pull Request from a branch and NOT from your "master". (You'll understand exactly why after reading #2)
  2. Updating the PR: To update the Pull Request, simply add a new commit to the branch it was created from. Conversely, be warned that any additional changes/commits you make to that PR branch (before the "Pull Request" is accepted/merged) will immediately be included in that existing "Pull Request". This means that, if you want to continue your local development, you must create that "Pull Request" from a semi-static branch (so that any additional commits you make on your local "master" in the meantime don't get auto-included as part of your existing Pull Request).
    • The reason why this occurs is that a "Pull Request" just points at a specific "branch" (the branch it was initialized from). It does NOT point at a specific set of commits. So, when the "Pull Request" is accepted/merged, you are pulling in the latest version of that "branch". For more information, closely read the GitHub help page on Pull Requests, specifically noting the following statement:

      Pull requests can be sent from any branch or commit but it’s recommended that a topic branch be used so that follow-up commits can be pushed to update the pull request if necessary.

  3. Communicating about your PR: Once your Pull Request is created, you can use the GitHub Pull Request tools to communicate with the Committer who is assigned to the Pull Request. If further changes are requested, you can make those changes on the branch where you initiated the Pull Request (and those changes will automatically become part of the Pull Request, as described above).
  4. Squashing or cleaning up your PR: If you ever want to "clean up" your Pull Request, we recommend using 'rebase' to "squash" several related commits into one. Here's a good example: http://stackoverflow.com/a/15055649 (see the section on "Cleaning Commit History")
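
One way to squash follow-up commits with rebase, without editing the todo list by hand, is to record them as "fixup" commits and let an autosquash rebase fold them in. This is a minimal sketch in a hypothetical throwaway repo (the file Fix.java, the branch contents and the identity "Example Dev" are made up). It is one possible approach, not necessarily the one the linked answer describes.

```shell
#!/bin/sh
# Sketch only: squash follow-up commits into the original one via rebase --autosquash.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# The feature branch that the (hypothetical) Pull Request points at.
git checkout -q -b DS-123
echo "first draft" > Fix.java
git add Fix.java
git commit -q -m "DS-123: example fix"
target=$(git rev-parse HEAD)

# Two follow-up commits, each marked as a fixup of the original commit.
echo "review feedback" >> Fix.java
git commit -q -a --fixup="$target"
echo "typo correction" >> Fix.java
git commit -q -a --fixup="$target"

# Fold the fixups into the original commit. GIT_SEQUENCE_EDITOR=true accepts
# the todo list that --autosquash prepares, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

# Only one commit remains on the branch relative to master.
git log --oneline master..DS-123
```

Because this rewrites the branch's history, a branch that was already pushed has to be pushed again with force (for example "git push --force-with-lease origin DS-123"), which is why rebasing should be limited to your own feature branches.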

...

If you don't have an SSH key generated yet, you can generate one using:

Code Block
# GitHub no longer accepts DSA keys, so generate an Ed25519 key instead
ssh-keygen -t ed25519

The recommended setup is as follows:

  • Read/write access to the DSpace/DSpace repo. The git remote should be named "upstream" in your local clone.
  • Read/write access to your YourName/DSpace repo. This is a fork of the DSpace/DSpace repo (created by clicking the "Fork" button on GitHub). You should clone your local repo from this, therefore it will be visible as the "origin" remote in your local repo.
  • A local repo on your machine (or multiple repos if you work on multiple machines).
Code Block
# make sure you have forked the DSpace/DSpace repo on GitHub to your OWN GitHub Account
git clone git@github.com:YourName/DSpace.git
cd DSpace
git remote add upstream git@github.com:DSpace/DSpace.git
git fetch upstream
# now "git remote -v show" should look like this:
#   origin    git@github.com:YourName/DSpace.git (fetch)
#   origin    git@github.com:YourName/DSpace.git (push)
#   upstream  git@github.com:DSpace/DSpace.git (fetch)
#   upstream  git@github.com:DSpace/DSpace.git (push)

For more information, and a sample git workflow, see the "Developing from a Forked Repository" section above.

Testing Pull Requests

Testing a Single Pull Request

If you want to test a single pull request, there are two options:

  • Follow the "Command Line Instructions" on the PR itself. Simply visit the PR page and click the "Command Line Instructions" link next to the big green button.
  • OR, if the PR is outdated or older, it may be easier to check out the PR directly. This is described in more detail at: https://help.github.com/articles/checking-out-pull-requests-locally/

Here are the quick steps for checking out a single PR directly (borrowed from the GitHub instructions listed above):

Code Block
# Assumes the DSpace/DSpace repo is already set up as your "upstream" repository
# [ID] = Pull Request number (at the end of the URL)
# [LOCAL-BRANCH] = Name of the branch you want to create (locally) to work on this PR
git fetch upstream pull/[ID]/head:[LOCAL-BRANCH]

# Now checkout your newly created local branch
git checkout [LOCAL-BRANCH]

# Optionally, you may want/need to rebase this PR based on the latest code on master
git rebase master

# You can now do anything you want on this local branch to test or update/change the code

# Optionally, if you need to, you can push this local branch back up to your forked repo (origin)
# And even create a *new* updated PR from it (via the GitHub UI: https://help.github.com/articles/creating-a-pull-request)
git push origin [LOCAL-BRANCH]
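
The single-PR fetch can also be rehearsed locally. This is a minimal sketch, assuming a local bare repository plays the role of DSpace/DSpace and that a ref named refs/pull/1/head is created by hand to imitate how GitHub exposes Pull Request #1. All names here (upstream.git, contributor, tester, pr-1-test, Change.java, "Example Dev") are hypothetical.

```shell
#!/bin/sh
# Sketch only: fetch one simulated Pull Request into a local branch.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# Local stand-in for the central repo.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master

# A contributor repo that publishes master plus one simulated PR.
git init -q contributor
cd contributor
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$sandbox/upstream.git" master
git checkout -q -b proposed-change
echo "// proposed change" > Change.java
git add Change.java
git commit -q -m "Proposed change"
# GitHub exposes the tip of each PR as refs/pull/[ID]/head; imitate that here.
git push -q "$sandbox/upstream.git" proposed-change:refs/pull/1/head
cd "$sandbox"

# The tester's clone, with the central repo registered as "upstream".
git clone -q -o upstream "$sandbox/upstream.git" tester
cd tester

# Same command shape as above, with [ID]=1 and [LOCAL-BRANCH]=pr-1-test.
git fetch -q upstream pull/1/head:pr-1-test
git checkout -q pr-1-test
git log --oneline -1
```

A plain clone does not download refs/pull/*, which is why the explicit fetch of "pull/1/head" is needed before the branch can be checked out.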


Pulling down all open Pull Requests

If you have added an "upstream" repository to your clone of your fork, here's a handy command that makes checking out Pull Requests for testing purposes easier (inspired by this help page on the GitHub site):

Warning: ~70MB download

Because of the high number of pull requests, the trick below results in a ~70MB download when you first fetch these pull request branches. This is not recommended on a low-bandwidth connection.

Code Block
# quote the refspec so that your shell does not try to expand the "*" characters
git config --add remote.upstream.fetch '+refs/pull/*/head:refs/remotes/upstream/pr/*'
 
# to fetch *all* the pull requests, type this
git fetch upstream
 
# and check out a specific PR into a local branch (named whatever you want)
# (the refspec above stores each PR as the remote-tracking ref "upstream/pr/[ID]")
git checkout -b [LOCAL-BRANCH] upstream/pr/248
 
# For example, this checks out PR #248 into a local branch. You can name it whatever you want,
# but in this example it is using this format:
# [JIRA-ticket-number]-[PR-number]-[description]
git checkout -b "DS-1597-PR-248-test-for-oracle-compatibility" upstream/pr/248

# Now you can test this local branch, update it, etc. as you need to
# As necessary, you can also create a *new* PR from it by optionally pushing it up to your forked repo
git push origin [LOCAL-BRANCH]
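
To see what the extra fetch refspec does, it can be tried against a local stand-in. This is a minimal sketch, assuming a local bare repository imitates DSpace/DSpace and two hand-made refs (refs/pull/1/head and refs/pull/2/head) imitate Pull Requests. The repository, branch and file names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: map every simulated Pull Request to a remote-tracking ref.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# Local stand-in for the central repo, seeded with master and two fake PRs.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q contributor
cd contributor
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$sandbox/upstream.git" master
for id in 1 2; do
  git checkout -q -b "change-$id" master
  echo "// change $id" > "Change$id.java"
  git add "Change$id.java"
  git commit -q -m "Proposed change $id"
  git push -q "$sandbox/upstream.git" "change-$id:refs/pull/$id/head"
done
cd "$sandbox"

# The tester's clone, with the central repo registered as "upstream".
git clone -q -o upstream "$sandbox/upstream.git" tester
cd tester

# Add the extra refspec, then fetch: each PR becomes upstream/pr/[ID].
git config --add remote.upstream.fetch '+refs/pull/*/head:refs/remotes/upstream/pr/*'
git fetch -q upstream
git branch -r

# Create a local test branch from PR #2.
git checkout -q -b DS-1597-PR-2-test upstream/pr/2
```

After this, every later "git fetch upstream" also refreshes the "upstream/pr/*" refs, so updated Pull Requests are picked up without any further configuration.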


Getting a commit to multiple branches (backporting)

...