Skip to content

Massive refactor#15

Merged
purcell merged 37 commits intomasterfrom
refactor
Mar 25, 2018
Merged

Massive refactor#15
purcell merged 37 commits intomasterfrom
refactor

Conversation

@tarsius
Copy link
Member

@tarsius tarsius commented Feb 22, 2018

Subsequently I am criticizing the existing code quiet harshly. I am pretty sure that @purcell and @milkypostman will understand that correctly, but others might read this as well, so I would like to emphasize that this is not intended as a criticism of the authors or their skills. I maintain a project similar to Melpa and as part of that maintain tools similar to this one, and know how much work it is to just keep things going. It is very tempting and often the only course of action that won't make you burn out or otherwise go insane to just keep adding features and fixing bugs without taking a step back and refactor.

Motivation

  1. A long time ago I have said that I would like to replace the timestamps that are used on Melpa (non-stable) with version strings of the form VERSION.UPDATE-COUNT. Or something like that. I think that the plan outlined in Evaluating a new version numbering scheme melpa#2955 is fairly sound, but I haven't looked at it in a while.

    Many others have expressed interest in something like this but nobody has implemented anything. Anyway the point is that I think it should be fairly simple to do that. But when I went looking for how to inject that feature into the existing code I ran into a lot of accidental complexity.

  2. I am the maintain the Emacsmirror, which in many ways is similar to Melpa. The primary output of Melpa is a collection of package.el-compatible packages, while the primary output of the Emacsmirror is a database containing a list of Git repositories containing Emacs packages, as well as lots of metadata about these packages.

    For most users the output of Melpa is more useful, but I am happy that there now are a few users who use my Borg package manager and my Epkg manager.

    But my work on the Emacsmirror benefits Melpa users as well. In the early days of the Emacsmirror I went looking for packages to add myself. Nowadays I rely on Melpa for that purpose. Some time after a package is added to Melpa, I also semi-automatically add it to the Emacsmirror.

    Because the Emacsmirror extracts more metadata about packages than Melpa does, that often leads to the discovery of issues that @purcell did not catch during his (very valuable) review. I then contact the maintainer and ask them to fix the issues, or even do it for them. These fixes make it into Melpa as well on the next update.

    Additionally, even though I don't use package.el myself, I am a top contributor to Melpa (right after the two head honchos). By comparing the data found in Melpa and the Emacsmirror I not only find issues in individual packages but also to issues in Melpa recipes and that leads to a lot of commits.

    Anyway the Emacsmirror extracts a lot more metadata about packages than Melpa (or that package.el currently supports (give epkg a try)) and some of that could be ported to Melpa (and package.el). But again the accidental complexity gets in the way.

    The Emacsmirror too could benefit from Melpa more if package-build.el were more similar to the respective tools used to maintain the Emacsmirror.

  3. The reason I am doing this now is that after the next release Magit's development will move to two long-lived branches (likely maint and master). I would like Melpa Stable to continue to contain releases such as 2.12.1, Melpa to stay on maint on offer versions like 2.12.1.5, and my own Elpa archive to offer versions like 2.99.101 from master. For that I need VERSION.UPDATE-COUNT support, as well as some other things that I don't want to implement on top of the current package-build.el implementation.

Melpa could be improved in many ways, and many users request such changes, but when they are told by the overworked maintainers to lend a hand not much happens. I think this is partially due to the accidental complexity that makes it hard to add something new. The primary goal of these commits is to remove some of that complexity so that package-build.el and by extension Melpa can grow new features.

Status

  • I have only lightly tested these changes and I guess I should recommend that this pull-request isn't merged until we have done some more testing.

    Similar pull-requests that I created in the past tended to have some bugs. These bugs were rather silly and "easily preventable", but that wasn't only the result of too little testing but is also to be expected when replacing accidental complexity with a cleaner implementation. When there are many unnecessary complications, then it is easy to mistake a complication that actually is necessary for one of the unnecessary ones.

    So maybe it wouldn't be a bad idea to merge this pretty soon anyway, but on a day where we all are available to quickly fix any regressions encountered by users.

  • This is only a step toward a cleaner code base.

    There are some temporary hacks. I have removed many obsolete doc-strings without replacing them with new ones, because I only have so much motivation to improve things incrementally. I am doing it for the code, but the documentation I don't want to update until things are done instead of over and over again.

    Also I have to turn my attention to other things now. So it could be a while until the next patch series.

Changes

  • Make code that determines the appropriate version self-contained.

  • Use fewer arguments to pass data around.

  • Use Eieio objects to represent recipes instead of plists. These objects contain more information than the old plists, which is what makes it possible to use fewer arguments to pass information around.

  • Replace leaky abstractions with self-contained functions, doing one thing only, and all of it.

    This too is made easier by the use of objects that contain all the necessary information with no overhead such as having to add additional arguments to many functions just so that the work can be done where it makes most sense to do it. Previously some work was done in the wrong place, simply because that was the only place where the required information was available.

  • Organized the file into logical units by using sections. This was a very necessary step, which I already began in earlier patch series. It helps a lot recognizing accidental complexity and removing it.

    This required moving some existing function definitions. The order of the sections themselves and the order of definitions within a given section is still fairly random.

  • Remove many helper functions, by inlining them. These helper functions often existed because the functions that called them were to complicated, making it necessary to move at least some of the code elsewhere. But after cleaning up the calling function that often wasn't necessary anymore.

    The existence of many tiny helper functions that are defined in weird places and have sub-optimal names and doc-strings, makes it very hard to fully grasp what a certain task involved.

    Having everything in one place helps a lot here. Which isn't to say that tiny helper functions are bad per se, but at this point in the evolution of the code base, I feel they do more harm than good. Also they often existed to cover up another issue, not to increase the separation of concerns.

This patch series does not fix all of the mentioned issues. It fixes some instances of each of the mentioned issue categories, while leaving others untouched.

:abstract t)

(defun package-recipe-lookup (name)
(let ((plist (cdr (assq (intern name) (package-build-recipe-alist)))))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assoc-default takes a test arg; IIUC (assoc-default (intern name) (package-build-recipe-alist) #'eq) should do the same.

Copy link
Member Author

@tarsius tarsius Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what would be the benefit of using assoc-default here? The existing code seems to do exactly what I want it to do. Do you want it to do something else? Why?

I don't see how this is an improvement.

-(assq ...)
+(assoc-default ... #'eq)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes the intent to get the key rather than the pair clearer to me. Up to you, of course.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assq is indeed more idiomatic in this case; the intent is clearest (to me) with that.

;;; Classes

(defclass package-recipe ()
((url-format :allocation :class :initform nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all the :initform nil specifications really necessary? One would assume that nil would be the default anyway.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are, in fact, necessary.

(defclass foo ()
  ((foo-field)))

(oref (make-instance 'foo) foo-field)
;; => Debugger entered--Lisp error: (unbound-slot foo "#<foo foo>" foo-field oref)

Copy link
Member Author

@tarsius tarsius Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just because it is an error to access an unbound slot doesn't mean that we have to make sure it is never unbound. I have just decided that preventing that kind of error would be a good idea during this stage of development.

:abstract t)

(defun package-recipe-lookup (name)
(let ((plist (cdr (assq (intern name) (package-build-recipe-alist)))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume plist here is the actual recipe submitted by contributors? e.g., recipes/magit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the cdr of that.

tarsius added 26 commits March 23, 2018 19:17
And move the definitions of `package-build-reinitialize'
and `package-build--read-from-file' in the process.
Do not print the full source and destination file.  That is so noisy
that nobody will actually inspect the output.  Also prefix every line
about a renamed file with "!".  That way it might happen that someone
notices if something strange is going on due to an invalid file spec.
Magit hasn't been doing that for a few years now.
The new value is more to the point and also better matches the name
of the variable all callers assign the value to, `tag-version'.
...instead of using a REGEXP to find tags between the point and
BOUND.  The new, optional REGEXP has nothing to do with the old,
removed argument by the same name.
Also change all callers to use `cl-destructuring-bind'
to split the returned value apart.
Add a new argument DESTINATION, which controls where output goes.

If it is nil, then it goes into the buffer named
"*package-build-checkout*", which is very useful because previously
most callers had to make that buffer current to keep this function
from cluttering some buffer.

If it is t then output goes into the current buffer, which is useful
because for a few callers that is the right thing to do.

If it is an actual buffer, then output goes into that buffer, which
isn't really used, but makes this somewhat consistent with built-in
functions that run sub-processes.
REPO is ambiguous - it might refer to the local repository or to the
url of the remote repository "connected" to that repository.  URL on
the other hand is quite clear.
REPO is ambiguous - it might refer to the local repository or to the
url of the remote repository "connected" to that repository.  URL on
the other hand is quite clear.

The "git"/"hg" part won't be needed anymore once we switch to using
eieio.  So you can ignore that noise.

The "used" is added to differentiate these functions from the method
`package-build--upstream-url', which will be added soon.

Also move the definitions of these functions.
Instead of passing its value around, consult the variable
`package-build-archive-dir' when its value is actually needed.
Instead of passing the value around, determine it once it is needed.
Add a new function `package-recipe--working-tree' to do so.  The name
of that function does not match the name of the library that defines
it because it will be moved to a library with a matching name in a few
commits.

Don't do this for a few functions that will be removed soon anyway.

In many cases the DIR argument is replaced with a new argument NAME
because `package-recipe--working-tree' needs to know for what package
it should return the working tree.  However in a later step that new
argument as well as the existing CONFIG argument will be replaced with
a single argument RCP.  What is being done here is but a step in that
direction.
Remove the unnecessary hand-crafted method dispatch mechanism and
replace it with a simple `cl-case', which is more than sufficient,
considering that only one of the "methods" actually is implemented.

Regardless, the next commit will turn this into a proper generic
function.
... absorbing `package-build--update-git-to-ref',
`package-build--git-head-branch' and some code from
`package-build--checkout-git'.

Because this function will soon be absorbed by another
new function, don't bother coming up with a better name.
tarsius added 11 commits March 23, 2018 19:17
... replacing PACKAGE-NAME argument.  That way we can use the
information that is contained in RCP, which we will start doing
in the next commit.
Previously its task was performed by `package-build-archive',
which shouldn't have to concern itself with this sort of thing.
And remove `package-build-add-to-archive' in the process.
This looked rather silly and probably was only useful while
debugging some issue that has been fixed a long time ago.
Replace it with a message in `package-build--checkout',
which isn't weird, but still quite unnecessary if you ask me.
Copy link
Member

@purcell purcell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This looks like a solid piece of work, but it's also extensive, so I'm consequently not sure how best to fully assess & communicate the likely downstream effects not just on MELPA but on Cask, Quelpa etc., and any other code that uses this.

However, I think if those circular references are resolved, it would be fine to merge this here, then I'd do a bit of testing and move MELPA over to this version.

@@ -0,0 +1,66 @@
-include config.mk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this Makefile exist only to generate autoloads? Not sure why this would be necessary at all.

Copy link
Member Author

@tarsius tarsius Mar 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makefiles such as this one (I use very similar ones for most of my packages) are the reason why my packages have significantly fewer compile warnings than other elisp. It's just very simple to type make and learn whether you only didn't get any warnings when compiling from within Emacs because you happened to also have the unspecified dependencies loaded among other repl vs. separate compile step issues.

Yes, it is a a simple Makefile, but that doesn't mean it is worthless. Rather that's the result of reducing it to the essential parts over the course of a few years. Other people use Cask instead. That looks much more impressive not least because it adds a dependency on Python and comes with a manual and does project specific dependency management (which makes sense for the build systems that idea was borrowed from, much less for Emacs which doesn't support loading different versions into the "vm").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, cool. I'd be happy to end up with a Travis build for this, and the makefile would be useful in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started working on that but need you to enable it at https://travis-ci.org/profile/melpa.


(require 'eieio)

(declare-function package-build-recipe-alist "package-build" ())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This circular reference indicates to me that the division into files isn't quite right here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but I am also not done...

(old-names :initarg :old-names :initform nil))
:abstract t)

(defun package-recipe-lookup (name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think these bits don't belong in here, though I really like the introduction of the data structure and github/hg/bitbucket specialisations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I tend to come to the opposite conclusion. I think all the "read recipes from filesystem" stuff should go into package-recipe.el.

Or maybe not, but what I have done here is to start implementing a new interface to the data stored on disk, while keeping the old interface intact. The circular reference is due to the new interface being just a stub around the old interface for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if it's not the end game, I'm cool with that.

(plist-get config :url)))
(funcall (intern (format "package-build--checkout-%s" fetcher))
name config (file-name-as-directory working-dir))))
(cl-defmethod package-build--checkout :before ((rcp package-recipe))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure this code is fine, but it's also much less easy to understand than before IMO. I haven't worked much with the eieio objects + cl-defmethod style, and other contributors likely haven't either. Doesn't mean I don't like it, but that's a potential consideration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know but ... ah I see you are talking about the :before. This code could also be moved into package-build-archive. I would prefer to leave it like this for a while though. It just messages some information and is rather irrelevant to the flow of things, which makes it desirable to separate it from the important parts. Once those parts are finished, the messages can be moved to the place that eventually turns out to be the correct one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think I would have liked to remove these messages completely but kept them to avoid unnecessary changes.

Copy link
Member Author

@tarsius tarsius Mar 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the context in which these three lines are printed:

 • Building package magit ...
Package: magit
Fetcher: github
Source:  https://github.com/magit/magit.git

Cloning https://github.com/magit/magit.git to /home/jonas/git/emacs/melpa/working/magit/
  • Package: magit is unnecessary, the same information is printed right above.
  • Source: https://github.com/magit/magit.git is unnecessary, the same information is printed right below.
  • Fetcher: github, while not redundant, also isn't really useful here. This is not the place were one would make sure that a recipe author used e.g. github instead of the generic git. Also despite this output you occasionally overlook such a mistake. I do notice and fix them later on, but I rely on other tools to do so.

So let's just remove this.

(package-build--checkout-hg name (plist-put (copy-sequence config) :url url) dir)))

;;; Utilities
(cl-defmethod package-build--checkout ((rcp package-hg-recipe))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice though.

(defun package-build--package-buffer-info-vec ()
"Return a vector of package info.
`package-buffer-info' returns a vector in older Emacs versions,
and a cl struct in Emacs HEAD. This wrapper normalises the results."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should start normalising into a cl struct...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be better I agree. The easiest way of doing that though would be to simply drop support for older Emacsen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be open to that. Any idea when the switch was made?

(package-build--message "%s => %s" file newname)
(copy-directory file newname))))
(package-build--message
"Copying files (->) and directories (=>)\n from %s\n to %s"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@tarsius
Copy link
Member Author

tarsius commented Mar 24, 2018

I'm consequently not sure how best to fully assess & communicate the likely downstream effects not just on MELPA but on Cask, Quelpa etc., and any other code that uses this.

We could just use it for Melpa for a while.

@purcell purcell merged commit ea9dd92 into master Mar 25, 2018
@purcell
Copy link
Member

purcell commented Mar 25, 2018

We could just use it for Melpa for a while.

Sounds like a plan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants