elwinar/starneighbours

starneighbours

installation

All that is needed is to install uv, then sync the environment:

uv sync

running

Starting the API can be done either via uv:

uv run fastapi run --port 9000

Or by running fastapi directly:

.venv/bin/fastapi run --port 9000

reference

get starneighbours

GET /repos/<user>/<repo>/starneighbours
Authorization: <github_token>
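
As a rough illustration, a call from Python could be built like the sketch below. The host `localhost`, the default `base_url` (using the port from the run command above), and the helper name `build_request` are assumptions made for the example. The response format is not documented here, so the sketch only builds the request and leaves sending it as a comment.

```python
import urllib.request


def build_request(user: str, repo: str, token: str,
                  base_url: str = "http://localhost:9000") -> urllib.request.Request:
    """Build a GET request for the starneighbours endpoint (hypothetical helper)."""
    url = f"{base_url}/repos/{user}/{repo}/starneighbours"
    # The GitHub token goes in the Authorization header, as in the reference above.
    return urllib.request.Request(url, headers={"Authorization": token}, method="GET")


# Sending it requires the API to be running:
# request = build_request("elwinar", "starneighbours", "<github_token>")
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```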

choices

package management

The project uses uv, because why not. As this is a small, inconsequential project, I might as well try things out.

It's as fast as it claims to be, which is already something.

http server

In the same vein, FastAPI seems to be a modern-ish server for doing HTTP APIs.

For a small 20-line project this is OK. I don't really like the parameter declaration style tho, it's way too implicit for me.

github api

Rather than doing HTTP requests manually, I decided to use one of the many reputable libraries for interacting with GitHub (PyGithub), which lets me focus on the feature instead of writing a client by hand.

The pagination management with iterators is nice, especially the iterator giving details about the number of items, etc.

auth

The endpoint is authenticated using a GitHub token. Since we are querying the GitHub API, this is reasonable, as that API is heavily rate-limited for unauthenticated requests.

It also spreads the rate-limiting load across the different callers, ensuring fairness in a multi-user context.

caching

The caching strategy used here is a very simple library-level cache, backed by a sqlite database, thanks to a library designed to do exactly that.

This is primitive but good enough for a development context, as it speeds up the testing cycle without too much hassle.
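
To give an idea of what a SQLite-backed cache boils down to, here is a minimal sketch using only the standard library. It is not the library used in the project and not how that library is implemented; the class name `SqliteCache`, the table layout, and the TTL handling are all assumptions made for the example.

```python
import json
import sqlite3
import time


class SqliteCache:
    """Tiny key/value cache stored in a SQLite file (illustrative only)."""

    def __init__(self, path: str = "cache.sqlite", ttl: float = 3600.0):
        self.ttl = ttl
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def get(self, key: str):
        row = self.db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        # Miss if the key is absent or the entry is older than the TTL.
        if row is None or time.time() - row[1] > self.ttl:
            return None
        return json.loads(row[0])

    def set(self, key: str, value) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
            (key, json.dumps(value), time.time()),
        )
        self.db.commit()
```

Because the data lives in a file, the cache survives restarts of the server, which is what makes it handy during development.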

testing

I haven't written any tests. For something this simple I don't feel they are needed: a test is only useful if it can fail, and this one wouldn't. Also, most of the heavy lifting is done by the libraries, so I would mostly be testing PyGithub, which is not very useful.

If the endpoint logic became more complex, tests could be justified tho, in which case a mock of the GitHub API would probably be the way to go, either by replacing the PyGithub object or (preferably) by starting a fake GitHub API with controlled data.
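
The first option (replacing the client object) could look like the sketch below. The function `star_neighbours` is a hypothetical stand-in for the endpoint logic, not the project's actual code, and its return shape is an assumption. The fake classes only mimic the few PyGithub names the sketch relies on (`get_repo`, `get_stargazers`, `get_starred`, `login`, `full_name`).

```python
from collections import defaultdict


def star_neighbours(client, full_name: str) -> dict:
    """Hypothetical endpoint logic: map each other repo to the stargazers it shares."""
    neighbours = defaultdict(list)
    for user in client.get_repo(full_name).get_stargazers():
        for starred in user.get_starred():
            if starred.full_name != full_name:
                neighbours[starred.full_name].append(user.login)
    return dict(neighbours)


class FakeRepo:
    def __init__(self, full_name, stargazers=()):
        self.full_name = full_name
        self._stargazers = list(stargazers)

    def get_stargazers(self):
        return iter(self._stargazers)


class FakeUser:
    def __init__(self, login, starred=()):
        self.login = login
        self._starred = list(starred)

    def get_starred(self):
        return iter(self._starred)


class FakeGithub:
    """Stand-in for the client object, serving controlled data."""

    def __init__(self, repos):
        self._repos = {repo.full_name: repo for repo in repos}

    def get_repo(self, full_name):
        return self._repos[full_name]
```

A test then builds a small graph of fake users and repositories, passes a `FakeGithub` instead of the real client, and asserts on the returned mapping.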

improvements

caching

The caching strategy is primitive but efficient. If we were to have many users this could quickly get out of hand tho, and the disk-backed cache should be replaced by an in-memory cache for pure speed, or even an external cache like Redis if we want to share the cache between instances.

Another improvement would be the use of a singleflight-type library (example here https://pypi.org/project/singleflight/) to avoid making duplicate calls during bursts of requests. This could improve scalability by serving more requests with fewer resources: I've seen a few serial-starrers in my own repositories, so it is reasonable to assume that under heavy load we could end up with multiple calls requesting the stars of the same user. This would also improve latency in those cases, as a later call would reuse the result of the call already in flight and finish faster than if it did all the work itself.
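
The singleflight idea itself is small enough to sketch with asyncio. This is a hand-rolled illustration, not the API of the library linked above; the class name `SingleFlight` and its `do` method are assumptions made for the example.

```python
import asyncio


class SingleFlight:
    """Collapse concurrent calls that share a key into one execution."""

    def __init__(self):
        self._inflight = {}

    async def do(self, key, fn):
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key: start the work and remember it.
            task = asyncio.ensure_future(fn())
            self._inflight[key] = task
            # Forget the key once the work is done, so later calls start fresh.
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # Every caller, first or not, waits on the same task.
        return await task
```

Keyed by the stargazer's login, this would make a burst of requests for the same serial-starrer trigger a single listing of their stars.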

Smart cache invalidation would use the star events of the users and keep a database as the cache. The problem we're solving here could be solved easily with a relational database, where we would need to initialize a user once and then just use the events since the last crawl to update the cache.
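
A sketch of what the relational version might look like, using SQLite for convenience. The table name `stars`, its columns, and the helpers `add_stars` and `neighbours` are hypothetical; how star events would actually be fetched and turned into rows is left out.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS stars (
    stargazer TEXT NOT NULL,
    repo TEXT NOT NULL,
    PRIMARY KEY (stargazer, repo)
);
"""

# Self-join: for every stargazer of the target repo, count their other stars.
NEIGHBOURS = """
SELECT other.repo, COUNT(*) AS shared
FROM stars AS target
JOIN stars AS other ON other.stargazer = target.stargazer
WHERE target.repo = ? AND other.repo != target.repo
GROUP BY other.repo
ORDER BY shared DESC, other.repo
"""


def add_stars(db, stargazer, repos):
    """Record newly seen stars; already-known rows are left untouched."""
    db.executemany(
        "INSERT OR IGNORE INTO stars (stargazer, repo) VALUES (?, ?)",
        [(stargazer, repo) for repo in repos],
    )
    db.commit()


def neighbours(db, repo):
    """Return (repo, shared stargazer count) rows for the given repository."""
    return db.execute(NEIGHBOURS, (repo,)).fetchall()
```

With this layout the initial crawl and the incremental updates both go through `add_stars`, and the endpoint becomes a single query.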

parallelism

The calls to GitHub are done sequentially, which is OK for small repositories but quickly becomes slow when encountering serial-starrers. A pipeline-type model, where one routine does the stargazer listing and a worker pool does the stars listing for multiple stargazers at the same time, would help here.

With the per-call token auth we have now, this would need to be either a per-call pipeline, or a global pipeline that is handed the token to use with each job.
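
The per-call variant could be sketched with a thread pool as below. The callables `list_stargazers` and `list_starred` are hypothetical placeholders for the two GitHub listings, and the function name and `workers` default are assumptions; the token is simply passed along with every job.

```python
from concurrent.futures import ThreadPoolExecutor


def starred_by_stargazer(list_stargazers, list_starred, token, workers=8):
    """Per-call pipeline: list stargazers lazily, fetch their stars in a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit one job per stargazer as soon as the listing yields it,
        # so star listings start while the stargazer listing is still going.
        jobs = [
            (login, pool.submit(list_starred, login, token))
            for login in list_stargazers(token)
        ]
        # Collect results; an exception in a worker is re-raised here.
        return {login: job.result() for login, job in jobs}
```

A global pipeline would instead keep one long-lived pool and put `(login, token)` pairs on a shared queue, at the cost of more plumbing.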
