Massive refactor by tarsius · Pull Request #15 · melpa/package-build

tarsius · 2018-02-22T18:41:41Z

Subsequently I am criticizing the existing code quiet harshly. I am pretty sure that @purcell and @milkypostman will understand that correctly, but others might read this as well, so I would like to emphasize that this is not intended as a criticism of the authors or their skills. I maintain a project similar to Melpa and as part of that maintain tools similar to this one, and know how much work it is to just keep things going. It is very tempting and often the only course of action that won't make you burn out or otherwise go insane to just keep adding features and fixing bugs without taking a step back and refactor.

Motivation

A long time ago I have said that I would like to replace the timestamps that are used on Melpa (non-stable) with version strings of the form VERSION.UPDATE-COUNT. Or something like that. I think that the plan outlined in Evaluating a new version numbering scheme melpa#2955 is fairly sound, but I haven't looked at it in a while.

Many others have expressed interest in something like this but nobody has implemented anything. Anyway the point is that I think it should be fairly simple to do that. But when I went looking for how to inject that feature into the existing code I ran into a lot of accidental complexity.
I am the maintain the Emacsmirror, which in many ways is similar to Melpa. The primary output of Melpa is a collection of package.el-compatible packages, while the primary output of the Emacsmirror is a database containing a list of Git repositories containing Emacs packages, as well as lots of metadata about these packages.

For most users the output of Melpa is more useful, but I am happy that there now are a few users who use my Borg package manager and my Epkg manager.

But my work on the Emacsmirror benefits Melpa users as well. In the early days of the Emacsmirror I went looking for packages to add myself. Nowadays I rely on Melpa for that purpose. Some time after a package is added to Melpa, I also semi-automatically add it to the Emacsmirror.

Because the Emacsmirror extracts more metadata about packages than Melpa does, that often leads to the discovery of issues that @purcell did not catch during his (very valuable) review. I then contact the maintainer and ask them to fix the issues, or even do it for them. These fixes make it into Melpa as well on the next update.

Additionally, even though I don't use package.el myself, I am a top contributor to Melpa (right after the two head honchos). By comparing the data found in Melpa and the Emacsmirror I not only find issues in individual packages but also to issues in Melpa recipes and that leads to a lot of commits.

Anyway the Emacsmirror extracts a lot more metadata about packages than Melpa (or that package.el currently supports (give epkg a try)) and some of that could be ported to Melpa (and package.el). But again the accidental complexity gets in the way.

The Emacsmirror too could benefit from Melpa more if package-build.el were more similar to the respective tools used to maintain the Emacsmirror.
The reason I am doing this now is that after the next release Magit's development will move to two long-lived branches (likely maint and master). I would like Melpa Stable to continue to contain releases such as 2.12.1, Melpa to stay on maint on offer versions like 2.12.1.5, and my own Elpa archive to offer versions like 2.99.101 from master. For that I need VERSION.UPDATE-COUNT support, as well as some other things that I don't want to implement on top of the current package-build.el implementation.

Melpa could be improved in many ways, and many users request such changes, but when they are told by the overworked maintainers to lend a hand not much happens. I think this is partially due to the accidental complexity that makes it hard to add something new. The primary goal of these commits is to remove some of that complexity so that package-build.el and by extension Melpa can grow new features.

Status

I have only lightly tested these changes and I guess I should recommend that this pull-request isn't merged until we have done some more testing.

Similar pull-requests that I created in the past tended to have some bugs. These bugs were rather silly and "easily preventable", but that wasn't only the result of too little testing but is also to be expected when replacing accidental complexity with a cleaner implementation. When there are many unnecessary complications, then it is easy to mistake a complication that actually is necessary for one of the unnecessary ones.

So maybe it wouldn't be a bad idea to merge this pretty soon anyway, but on a day where we all are available to quickly fix any regressions encountered by users.
This is only a step toward a cleaner code base.

There are some temporary hacks. I have removed many obsolete doc-strings without replacing them with new ones, because I only have so much motivation to improve things incrementally. I am doing it for the code, but the documentation I don't want to update until things are done instead of over and over again.

Also I have to turn my attention to other things now. So it could be a while until the next patch series.

Changes

Make code that determines the appropriate version self-contained.
Use fewer arguments to pass data around.
Use Eieio objects to represent recipes instead of plists. These objects contain more information than the old plists, which is what makes it possible to use fewer arguments to pass information around.
Replace leaky abstractions with self-contained functions, doing one thing only, and all of it.

This too is made easier by the use of objects that contain all the necessary information with no overhead such as having to add additional arguments to many functions just so that the work can be done where it makes most sense to do it. Previously some work was done in the wrong place, simply because that was the only place where the required information was available.
Organized the file into logical units by using sections. This was a very necessary step, which I already began in earlier patch series. It helps a lot recognizing accidental complexity and removing it.

This required moving some existing function definitions. The order of the sections themselves and the order of definitions within a given section is still fairly random.
Remove many helper functions, by inlining them. These helper functions often existed because the functions that called them were to complicated, making it necessary to move at least some of the code elsewhere. But after cleaning up the calling function that often wasn't necessary anymore.

The existence of many tiny helper functions that are defined in weird places and have sub-optimal names and doc-strings, makes it very hard to fully grasp what a certain task involved.

Having everything in one place helps a lot here. Which isn't to say that tiny helper functions are bad per se, but at this point in the evolution of the code base, I feel they do more harm than good. Also they often existed to cover up another issue, not to increase the separation of concerns.

This patch series does not fix all of the mentioned issues. It fixes some instances of each of the mentioned issue categories, while leaving others untouched.

xiongtx · 2018-02-22T19:12:13Z

package-recipe.el

+  :abstract t)
+
+(defun package-recipe-lookup (name)
+  (let ((plist (cdr (assq (intern name) (package-build-recipe-alist)))))


assoc-default takes a test arg; IIUC (assoc-default (intern name) (package-build-recipe-alist) #'eq) should do the same.

And what would be the benefit of using assoc-default here? The existing code seems to do exactly what I want it to do. Do you want it to do something else? Why?

I don't see how this is an improvement.

-(assq ...) +(assoc-default ... #'eq)

Makes the intent to get the key rather than the pair clearer to me. Up to you, of course.

assq is indeed more idiomatic in this case; the intent is clearest (to me) with that.

vermiculus · 2018-02-23T23:55:25Z

package-recipe.el

+;;; Classes
+
+(defclass package-recipe ()
+  ((url-format      :allocation :class       :initform nil)


Are all the :initform nil specifications really necessary? One would assume that nil would be the default anyway.

They are, in fact, necessary.

(defclass foo () ((foo-field))) (oref (make-instance 'foo) foo-field) ;; => Debugger entered--Lisp error: (unbound-slot foo "#<foo foo>" foo-field oref)

Just because it is an error to access an unbound slot doesn't mean that we have to make sure it is never unbound. I have just decided that preventing that kind of error would be a good idea during this stage of development.

vermiculus · 2018-02-24T00:16:32Z

package-recipe.el

+  :abstract t)
+
+(defun package-recipe-lookup (name)
+  (let ((plist (cdr (assq (intern name) (package-build-recipe-alist)))))


I presume plist here is the actual recipe submitted by contributors? e.g., recipes/magit?

Well the cdr of that.

And move the definitions of `package-build-reinitialize' and `package-build--read-from-file' in the process.

Do not print the full source and destination file. That is so noisy that nobody will actually inspect the output. Also prefix every line about a renamed file with "!". That way it might happen that someone notices if something strange is going on due to an invalid file spec.

Magit hasn't been doing that for a few years now.

The new value is more to the point and also better matches the name of the variable all callers assign the value to, `tag-version'.

...instead of using a REGEXP to find tags between the point and BOUND. The new, optional REGEXP has nothing to do with the old, removed argument by the same name.

Also change all callers to use `cl-destructuring-bind' to split the returned value apart.

Add a new argument DESTINATION, which controls where output goes. If it is nil, then it goes into the buffer named "*package-build-checkout*", which is very useful because previously most callers had to make that buffer current to keep this function from cluttering some buffer. If it is t then output goes into the current buffer, which is useful because for a few callers that is the right thing to do. If it is an actual buffer, then output goes into that buffer, which isn't really used, but makes this somewhat consistent with built-in functions that run sub-processes.

REPO is ambiguous - it might refer to the local repository or to the url of the remote repository "connected" to that repository. URL on the other hand is quite clear.

REPO is ambiguous - it might refer to the local repository or to the url of the remote repository "connected" to that repository. URL on the other hand is quite clear. The "git"/"hg" part won't be needed anymore once we switch to using eieio. So you can ignore that noise. The "used" is added to differentiate these functions from the method `package-build--upstream-url', which will be added soon. Also move the definitions of these functions.

Instead of passing its value around, consult the variable `package-build-archive-dir' when its value is actually needed.

Instead of passing the value around, determine it once it is needed. Add a new function `package-recipe--working-tree' to do so. The name of that function does not match the name of the library that defines it because it will be moved to a library with a matching name in a few commits. Don't do this for a few functions that will be removed soon anyway. In many cases the DIR argument is replaced with a new argument NAME because `package-recipe--working-tree' needs to know for what package it should return the working tree. However in a later step that new argument as well as the existing CONFIG argument will be replaced with a single argument RCP. What is being done here is but a step in that direction.

Remove the unnecessary hand-crafted method dispatch mechanism and replace it with a simple `cl-case', which is more than sufficient, considering that only one of the "methods" actually is implemented. Regardless, the next commit will turn this into a proper generic function.

... absorbing `package-build--update-git-to-ref', `package-build--git-head-branch' and some code from `package-build--checkout-git'. Because this function will soon be absorbed by another new function, don't bother coming up with a better name.

... replacing PACKAGE-NAME argument. That way we can use the information that is contained in RCP, which we will start doing in the next commit.

Previously its task was performed by `package-build-archive', which shouldn't have to concern itself with this sort of thing.

And remove `package-build-add-to-archive' in the process.

This looked rather silly and probably was only useful while debugging some issue that has been fixed a long time ago.

Replace it with a message in `package-build--checkout', which isn't weird, but still quite unnecessary if you ask me.

purcell

Thanks. This looks like a solid piece of work, but it's also extensive, so I'm consequently not sure how best to fully assess & communicate the likely downstream effects not just on MELPA but on Cask, Quelpa etc., and any other code that uses this.

However, I think if those circular references are resolved, it would be fine to merge this here, then I'd do a bit of testing and move MELPA over to this version.

purcell · 2018-03-18T04:35:34Z

Makefile

@@ -0,0 +1,66 @@
+-include config.mk


Does this Makefile exist only to generate autoloads? Not sure why this would be necessary at all.

Makefiles such as this one (I use very similar ones for most of my packages) are the reason why my packages have significantly fewer compile warnings than other elisp. It's just very simple to type make and learn whether you only didn't get any warnings when compiling from within Emacs because you happened to also have the unspecified dependencies loaded among other repl vs. separate compile step issues.

Yes, it is a a simple Makefile, but that doesn't mean it is worthless. Rather that's the result of reducing it to the essential parts over the course of a few years. Other people use Cask instead. That looks much more impressive not least because it adds a dependency on Python and comes with a manual and does project specific dependency management (which makes sense for the build systems that idea was borrowed from, much less for Emacs which doesn't support loading different versions into the "vm").

Yup, cool. I'd be happy to end up with a Travis build for this, and the makefile would be useful in that case.

I've started working on that but need you to enable it at https://travis-ci.org/profile/melpa.

purcell · 2018-03-18T04:37:20Z

package-recipe.el

+
+(require 'eieio)
+
+(declare-function package-build-recipe-alist "package-build" ())


This circular reference indicates to me that the division into files isn't quite right here.

Agreed, but I am also not done...

purcell · 2018-03-18T04:38:17Z

package-recipe.el

+   (old-names       :initarg :old-names      :initform nil))
+  :abstract t)
+
+(defun package-recipe-lookup (name)


Yep, I think these bits don't belong in here, though I really like the introduction of the data structure and github/hg/bitbucket specialisations.

... I tend to come to the opposite conclusion. I think all the "read recipes from filesystem" stuff should go into package-recipe.el.

Or maybe not, but what I have done here is to start implementing a new interface to the data stored on disk, while keeping the old interface intact. The circular reference is due to the new interface being just a stub around the old interface for now.

Okay, if it's not the end game, I'm cool with that.

purcell · 2018-03-18T04:43:11Z

package-build.el

-                                (plist-get config :url)))
-    (funcall (intern (format "package-build--checkout-%s" fetcher))
-             name config (file-name-as-directory working-dir))))
+(cl-defmethod package-build--checkout :before ((rcp package-recipe))


I'm sure this code is fine, but it's also much less easy to understand than before IMO. I haven't worked much with the eieio objects + cl-defmethod style, and other contributors likely haven't either. Doesn't mean I don't like it, but that's a potential consideration.

Yes I know but ... ah I see you are talking about the :before. This code could also be moved into package-build-archive. I would prefer to leave it like this for a while though. It just messages some information and is rather irrelevant to the flow of things, which makes it desirable to separate it from the important parts. Once those parts are finished, the messages can be moved to the place that eventually turns out to be the correct one.

Actually I think I would have liked to remove these messages completely but kept them to avoid unnecessary changes.

This is the context in which these three lines are printed:

• Building package magit ... Package: magit Fetcher: github Source: https://github.com/magit/magit.git Cloning https://github.com/magit/magit.git to /home/jonas/git/emacs/melpa/working/magit/

Package: magit is unnecessary, the same information is printed right above.

Source: https://github.com/magit/magit.git is unnecessary, the same information is printed right below.

Fetcher: github, while not redundant, also isn't really useful here. This is not the place were one would make sure that a recipe author used e.g. github instead of the generic git. Also despite this output you occasionally overlook such a mistake. I do notice and fix them later on, but I rely on other tools to do so.

So let's just remove this.

purcell · 2018-03-18T04:44:24Z

package-build.el

-    (package-build--checkout-hg name (plist-put (copy-sequence config) :url url) dir)))
-
-;;; Utilities
+(cl-defmethod package-build--checkout ((rcp package-hg-recipe))


This is really nice though.

purcell · 2018-03-18T04:45:52Z

package-build.el

+(defun package-build--package-buffer-info-vec ()
+  "Return a vector of package info.
+`package-buffer-info' returns a vector in older Emacs versions,
+and a cl struct in Emacs HEAD.  This wrapper normalises the results."


Maybe we should start normalising into a cl struct...

That would be better I agree. The easiest way of doing that though would be to simply drop support for older Emacsen.

I'd be open to that. Any idea when the switch was made?

purcell · 2018-03-18T04:47:08Z

package-build.el

-    (package-build--message "%s => %s" file newname)
-    (copy-directory file newname))))
+  (package-build--message
+   "Copying files (->) and directories (=>)\n  from %s\n  to %s"


tarsius · 2018-03-24T23:55:26Z

I'm consequently not sure how best to fully assess & communicate the likely downstream effects not just on MELPA but on Cask, Quelpa etc., and any other code that uses this.

We could just use it for Melpa for a while.

purcell · 2018-03-25T03:42:27Z

We could just use it for Melpa for a while.

Sounds like a plan!

xiongtx reviewed Feb 22, 2018

View reviewed changes

tarsius force-pushed the refactor branch from f42c3a3 to 544bb3b Compare February 22, 2018 19:45

vermiculus reviewed Feb 23, 2018

View reviewed changes

vermiculus reviewed Feb 24, 2018

View reviewed changes

tarsius added 26 commits March 23, 2018 19:17

Add Makefile

c6c6fe0

package-build.el: move a misplaced section heading

8c53784

package-build.el: split into more sections

211db66

And move the definitions of `package-build-reinitialize' and `package-build--read-from-file' in the process.

package-build.el: merge the two "Recipes" sections

a468d10

package-build--delete-file-if-exists: inline

653b78c

package-build--copy-file: inline

ad3447b

package-build--build-multi-file-package: correct a comment

b60a2c8

Magit hasn't been doing that for a few years now.

package-build--read-recipe: drop obsolete :module keyword

11ff7e8

package-build--read-recipe: update keyword validation

d563a8c

package-build--find-version-newest: refactor

1ac04a1

package-build--find-version-newest: rearrange return value

fcfe0b0

The new value is more to the point and also better matches the name of the variable all callers assign the value to, `tag-version'.

package-build--find-version-newest: among list of TAGS

e6c859e

...instead of using a REGEXP to find tags between the point and BOUND. The new, optional REGEXP has nothing to do with the old, removed argument by the same name.

package-build--find-version-newest: return version as a string

ce07f53

Also change all callers to use `cl-destructuring-bind' to split the returned value apart.

package-build--valid-version: inline

c47e02c

package-build--process-lines: new function

ecbcc70

package-build--find-parse-time{,-newest}: remove

ba56da5

package-build--checkout-{git,hg}: rename an internal variable

4f963ef

REPO is ambiguous - it might refer to the local repository or to the url of the remote repository "connected" to that repository. URL on the other hand is quite clear.

Remove the TARGET-DIR argument of many functions

252e3cb

Instead of passing its value around, consult the variable `package-build-archive-dir' when its value is actually needed.

package-build-archive: cosmetics

6daf05f

package-build-archive: include fetch duration in build duration

80ef08e

tarsius added 11 commits March 23, 2018 19:17

Add new library package-recipe

f125214

package-build--build-{single,multi}-file-package: add RCP argument

28a9ee7

... replacing PACKAGE-NAME argument. That way we can use the information that is contained in RCP, which we will start doing in the next commit.

package-build--write-entry-file: new function

a5b0c32

Previously its task was performed by `package-build-archive', which shouldn't have to concern itself with this sort of thing.

package-build--{,write-}archive-entry: refactor

b8e43f2

And remove `package-build-add-to-archive' in the process.

package-build--find-source-file: inline

5e928fc

package-build--package-buffer-info-vec: move definition

973a1fe

package-build-get-commit: move definitions

1b0406a

package-build-get-commit: rename to package-build--get-commit

29fe53f

package-build.el: rename a section

4af38b6

package-build--dump: don't show a message naming the dumped file

8c7eb1c

This looked rather silly and probably was only useful while debugging some issue that has been fixed a long time ago.

package-build-archive: no longer show a weird message

84a2b5a

Replace it with a message in `package-build--checkout', which isn't weird, but still quite unnecessary if you ask me.

tarsius force-pushed the refactor branch from 544bb3b to 84a2b5a Compare March 23, 2018 18:17

purcell reviewed Mar 24, 2018

View reviewed changes

purcell merged commit ea9dd92 into master Mar 25, 2018

purcell deleted the refactor branch March 25, 2018 08:10

steckerhalter mentioned this pull request Mar 25, 2018

package-build just broke everything with a "massive refactor" quelpa/quelpa#143

Closed


		(require 'eieio)

		(declare-function package-build-recipe-alist "package-build" ())

Uh oh!

Conversation

tarsius commented Feb 22, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Status

Changes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tarsius Feb 22, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tarsius Feb 24, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

purcell left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tarsius Mar 24, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tarsius Mar 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tarsius commented Mar 24, 2018

Uh oh!

purcell commented Mar 25, 2018

Uh oh!

Reviewers

Assignees

Labels

tarsius commented Feb 22, 2018 •

edited

Loading

tarsius Feb 22, 2018 •

edited

Loading

tarsius Feb 24, 2018 •

edited

Loading

tarsius Mar 24, 2018 •

edited

Loading

tarsius Mar 25, 2018 •

edited

Loading