Versions in the Time of Git Dependencies

“He allowed himself to be swayed by his conviction that human beings are not born once and for all on the day their mothers give birth to them, but that life obliges them over and over again to give birth to themselves.” ― Gabriel García Márquez, Love in the Time of Cholera

Edit

This is a rewrite; the original is below. It now aims to provide some recommendations on how to release a library for consumption as a git dependency. It all seems very obvious in retrospect.

Firstly, only publish a single artifact from a single git repository, or several artifacts with identical versions if it is a monorepo. You can imagine schemes that would work with multiple artifacts with independent versions, but tooling is going to have a hard time with any such scheme. This was the piece of the puzzle I was missing in my original post.

Once we've agreed to the above, then it becomes simple - versions become monotonically increasing, and easy for tooling to deal with.

Just put a git tag on a release, in the same way you would give a version to an artifact published to Maven. For a good scheme see Golang Modules. Make sure the scheme you choose sorts well with version-clj (h/t @borkdude for both).
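To illustrate why monotonically increasing tags are easy for tooling, here is a minimal sketch that picks the newest release from a list of tags. It is a simple stand-in, not version-clj: it only understands tags of the form vMAJOR.MINOR.PATCH and ignores everything else, including qualifiers such as -alpha. The function names parse-tag and latest-tag are made up for this example.

```clojure
(defn parse-tag
  "Parses a tag such as \"v1.10.0\" into [1 10 0].
  Returns nil for tags that do not match vMAJOR.MINOR.PATCH."
  [tag]
  (when-let [[_ major minor patch] (re-matches #"v(\d+)\.(\d+)\.(\d+)" tag)]
    (mapv #(Long/parseLong %) [major minor patch])))

(defn latest-tag
  "Returns the tag with the highest version, ignoring tags that do not parse.
  Equal-length vectors of numbers compare element by element,
  so v1.10.0 sorts after v1.9.3."
  [tags]
  (->> tags
       (filter parse-tag)
       (sort-by parse-tag)
       last))

(latest-tag ["v1.2.0" "v1.10.0" "v1.9.3" "nightly"])
;; => "v1.10.0"
```

The list of tags itself could come from something like `git ls-remote --tags <repository-url>`; a real tool would want a proper version comparison library rather than this regex.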

Remember you can have multiple tags for a given sha, so you could tag it v1.0.0-alpha to start with and promote it to v1.0.0, if that is your cup of tea.

Many thanks to the collective wisdom of Michiel Borkent @borkdude, Alex Miller @puredanger, Sean Corfield and Erik Assum @slipset on the clojurians slack.

Original post below

When you want to consume a library using git dependencies, you go to the project's GitHub page, look up the SHA from the README, put it in your deps.edn, and you're done, right? But what happens when you want to upgrade? Rinse and repeat? How do you even know that a new release is available?
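For reference, a git dependency entry in deps.edn looks roughly like the fragment below. The library name, tag and SHA are placeholders, not a real project.

```clojure
;; deps.edn -- io.github.example/some-lib, the tag and the SHA are hypothetical
{:deps
 {io.github.example/some-lib {:git/tag "v1.0.0"    ; human-readable release name
                              :git/sha "abcdef1"}}} ; short SHA prefix the tag must point at
```

Upgrading means replacing both values by hand, which is the step the rest of this post is concerned with.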

On Git

A git repository is an append-only log of changes to a project. Together with the repository URL, a SHA forms a content-based address for a particular state of the project. This is a natural identifier for that particular project state.

As consumers of a library, we aren't concerned with every single commit made to the repository - we want to know the SHA that the project's maintainers consider to be a release. We might not want the main branch HEAD commit, depending on the branching model used by the project developers.

A CI Pipeline

A good CI pipeline takes an immutable project artifact, and puts it through increasingly rigorous testing. It might start off as an alpha, or a release candidate, and as confidence is increased through testing, it can be promoted to a full release.

An artifact built from the contents of a single git SHA fits this model nicely. In an open source world, we can think of a SHA as an alpha release, that gets tested by a small number of people, and then gets published as a release - tools.build seems to follow this model for example, with announcement of alphas on #tools.build, followed by release announcements, for the same SHA, on #announce after a few people have tried it.

A new SHA, a new Version?

So which SHA do we want to put in our deps.edn?

In the maven world, versions are ordered, so when a new version is published, it is a signal that can be used by tooling to determine if an update is available.

In the git world, SHAs are not ordered, so how do we know when a new version is available? Should we check Slack, or a blog, or the project's home page? Or could we, as project authors, provide data to allow automation of this process?

release.edn

To provide release information, we could put the version information into the repository itself.

There are many release schemes, but we can model a project's releases as falling into release streams. Examples of this are "stable" or "alpha" or "v4.x". A release for each stream is then just a SHA associated with the stream.

A natural way to present this would be as a map in an EDN file (or JSON, or YAML, this doesn't need to be clojure specific).

{:stable {:git/tag "v1.0.0" :git/sha "abcdef"}
 :alpha {:git/tag "v1.1.0" :git/sha "abcdef"}
 :head {:git/tag "master" :git/sha :latest}}

A polylith or other monorepo could have different streams for the various artifacts it publishes.

If we decide on this format, we then need to make the file discoverable. One way would be to take it from a git repository's default branch, which seems like a good default.

The final piece of the puzzle would be for tooling like antq and clojure-dependency-update-action to use this information.
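As a rough sketch of what such tooling might do with the file, the code below reads a release map and reports whether a stream points at a different SHA from the one currently in use. The function name update-available and the SHAs are invented for the example; this is not how antq or any other existing tool works.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical contents of a release.edn file fetched from the default branch.
(def release-edn
  "{:stable {:git/tag \"v1.0.0\" :git/sha \"abcdef\"}
    :alpha  {:git/tag \"v1.1.0\" :git/sha \"123456\"}}")

(defn update-available
  "Returns the release map for stream when its SHA differs from current-sha.
  Returns nil when the stream is unknown or already up to date."
  [releases stream current-sha]
  (let [release (get releases stream)]
    (when (and release (not= (:git/sha release) current-sha))
      release)))

(def releases (edn/read-string release-edn))

(update-available releases :stable "abcdef")
;; => nil
(update-available releases :alpha "abcdef")
;; => {:git/tag "v1.1.0", :git/sha "123456"}
```

A consumer would choose the stream they want to follow once, and the tool could then rewrite the :git/tag and :git/sha in deps.edn whenever this returns a release.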

Good idea?

I'm sure I can't be the first to have thought of this.

What do you think - is this useful? How could the idea be improved?

Discuss this post here.

Published: 2021-11-21

Back to blogging

“My mind turned by anxiety, or other cause, from its scrutiny of blank paper, is like a lost child–wandering the house, sitting on the bottom step to cry.” — Virginia Woolf

I was inspired to write some blog posts, which led me to realise my current blog and blogging setup were completely broken.

Michiel Borkent (@borkdude) recently wrote about his migration from Octopress. His requirements were very similar to mine, so I copied and modified. Thank you Michiel!

His blog, REPL adventures, is well worth the read.

Blog Post Discussions

With a static web site, the perennial problem is how to enable discussions. Some people just punt on this, and point to reddit, but Michiel's solution is to use GitHub Discussions. As a way of owning the discussion, this has a lot of appeal.

I think it could be taken further. It would be great if we could automate the creation of a discussion topic when creating a blog post. Unfortunately the gh command line client doesn't support discussions yet, so that would require using GitHub's GraphQL API, which is more work than I wanted to do for now.

One downside, though, is that the discussions are not visible on the blog pages, where discussion could easily engender more discussion. I wonder if a GitHub Action could be triggered by conversation activity, and automatically republish the post with the discussion to date at the end of the post.

Blogging Frameworks vs Tasks

There are many blog site generators (I used Hugo, of course), even if we limit ourselves to Clojure: bootleg, nota, cryogen, and static, to name a few.

These are usually feature rich. The price for those features though, is extra complexity.

Michiel's blog uses babashka tasks to add a post, render posts, etc. These are extremely quick to run and make maintaining the blog simple. It does just what he needs, and no more.
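To give a flavour of the approach, here is a minimal bb.edn sketch. The task names and bodies are invented for illustration and are not taken from Michiel's blog; it also assumes a posts directory already exists.

```clojure
;; bb.edn -- hypothetical tasks; run with `bb new-post my-title` or `bb render`
{:tasks
 {new-post {:doc  "Create a skeleton post named after the first argument"
            :task (let [title (first *command-line-args*)]
                    (spit (str "posts/" title ".md") (str "# " title "\n")))}
  render   {:doc  "Render all posts to HTML (placeholder body)"
            :task (println "rendering posts...")}}}
```

Running `bb tasks` lists the tasks along with their :doc strings, which is most of the user interface a blog like this needs.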

This reminds me of project automation, and the tools.build approach of using composable code tasks to build just what is needed.

Maybe there is an opportunity to take the same approach for building a blog or static site. If we could pick from a selection of configurable tasks, maybe we wouldn't need to write our own.

Speaking of which, I have tried writing my own before, in common-lisp: cl-blog-generator.

And…

So I have a blogging setup. Now I just need to write something.

Discuss this post here.

Published: 2021-11-14

Generating Source Files with Leiningen

Recently, we needed to include some generated source files in a project. The source code generation was project specific, so we didn't want to have to create a leiningen plugin specifically for it. To get this to work required using quite a few of leiningen's features.

This post will explain how to use lein to customise your build to generate a source file, but many of the steps are useful for implementing any form of lein build customisation.

The Generator

The source code generator is going to live in the my.src-generator namespace. Here's an example, that just generates a namespace declaration for the my.gen namespace under target/generated/my/gen.clj.

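(ns my.src-generator
  (:require [clojure.java.io :refer [file]]))

(defn generate []
  (doto (file "target" "generated" "my" "gen.clj")
    ;; create target/generated/my if it does not exist yet
    (-> .getParentFile .mkdirs)
    ;; write the namespace declaration as the file's only content
    (spit "(ns my.gen)")))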
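A real generator will usually emit more than a namespace declaration. As a sketch of one way that might look, the code below builds the source text from a map of constants and writes it out. The names source-text and write-source! are hypothetical, and printing forms with pr-str is just one possible approach.

```clojure
(require '[clojure.java.io :as io])

(defn source-text
  "Builds the text of a namespace that defines one var per entry in constants."
  [ns-sym constants]
  (apply str
         (pr-str (list 'ns ns-sym)) "\n"
         (for [[k v] (sort-by key constants)]
           (str (pr-str (list 'def (symbol (name k)) v)) "\n"))))

(defn write-source!
  "Writes text to relative-path under root, creating parent directories first.
  Returns the java.io.File that was written."
  [root relative-path text]
  (let [f (io/file root relative-path)]
    (io/make-parents f)
    (spit f text)
    f))

(source-text 'my.gen {:answer 42})
;; => "(ns my.gen)\n(def answer 42)\n"
```

Keeping the text generation separate from the file writing makes the generator easy to try out at the REPL without touching the file system.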

Development only code

The source generation code should not be packaged in the jar, so we place it in dev-src/my/src_generator.clj, and add the dev-src and generated source directories to the :dev profile's :source-paths. The :dev profile is automatically used by leiningen unless it is producing a jar file. When producing the jar, the :dev profile will not be used, so dev-src will not be on the :source-paths (we add the generated directory to the base :source-paths below).

:profiles {:dev {:source-paths ["src" "dev-src" "target/generated"]}}

Running project specific code with leiningen

The run task can be used to invoke code in your project. To use lein's run task we need to add a -main function to the my.src-generator namespace.

(defn -main [& args]
  (generate))

In the project.clj file we also tell lein about the main namespace. In order to avoid AOT compilation of the main namespace, we mark it with :skip-aot metadata.

:main ^:skip-aot my.src-generator

Customising the jar contents

The generated files need to end up in the jar (and possibly be compiled), so we put them on the :source-paths in the project. If we had wanted to include the sources without further processing, we could have added the generated directory to :resource-paths instead.

:source-paths ["src" "target/generated"]
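For comparison, the :resource-paths alternative mentioned above might look like the fragment below. It assumes the project also keeps the default resources directory, which has to be listed explicitly once the key is set.

```clojure
;; project.clj fragment: ship generated files as-is instead of treating them as source
:resource-paths ["resources" "target/generated"]
```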

Extending the build process

Now we can tell lein to generate the source files whenever we use the project. We do this by adding the run task to the :prep-tasks key. Leiningen runs all the tasks in :prep-tasks before any task invoked by the lein command line.

The tricky bit here is that the run task will itself invoke the :prep-tasks, so we want to make sure we don't end up calling the task recursively and generating a stack overflow. To solve this, add a gen profile, and disable the prep tasks in it. We use the :replace metadata to ensure this definition takes precedence. See the leiningen profile documentation for more information on :replace and its sibling :displace.

:gen {:prep-tasks ^:replace []}

Then use this profile when setting the :prep-tasks key in the project.

:prep-tasks [["with-profile" "+gen,+dev" "run"]  "compile"]

Now when we run any command, the sources are generated.

Adding an alias

Finally we may want to just invoke the source generation, so let's create an alias to make lein gen run the generator. We need the gen profile for this, or otherwise the generator will run twice.

:aliases {"gen" ["with-profile" "+gen,+dev" "run"]}

The final project.clj

For reference, the final project.clj looks like this:

(defproject my-proj "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]]
  :source-paths ["src" "target/generated"]
  :main ^:skip-aot my.src-generator
  :prep-tasks [["with-profile" "+gen,+dev" "run"]  "compile"]
  :profiles {:dev {:source-paths ["src" "dev-src" "target/generated"]}
             :gen {:prep-tasks ^:replace []}}
  :aliases {"gen" ["with-profile" "+gen,+dev" "run"]})

Conclusion

This required using many of lein's features to get working - hopefully you'll find a use for some of them.

Discuss this post here.

Published: 2013-10-28
