Codeberg: a free, non-commercial GitHub alternative (codeberg.org)
https://codeberg.org uses the https://gitea.io software for its free remote Git repository service. If you're uncomfortable with non-free software, open-core, commercial underpinnings and/or corporate ownership of Github and Gitlab, hosting your repos on Codeberg — or starting your own Gitea site — are very viable alternatives.
(Edited for typos)
Thanks, this was very helpful! I was scanning the prose on the Codeberg website looking to see if Codeberg itself was open source, but all it said was that it was "free" and "non-profit."
Now I see that Gitea is the software (basically a clone of GitHub that is visually almost div-for-div, released with an MIT license, that you can host yourself), and Codeberg is central hosting (a "hub") running Gitea, operated as a non-profit.
Gitea is a fork of gogs.
Gogs is still in development too. If people are interested, definitely check it out.
If you don’t need issue tracking then also check out cgit.
Gitea is very slow (eg 4s GET requests) for large codebases. cgit is like lightning.
I had developed the impression that gitea kept all the project metadata -- bug tracking, PRs, etc -- in the same repository with the project code. Was this impression correct? Looking at the "Comparison" page does not suggest anything, one way or the other.
To me that would be the most important differentiator. Do any of the others do it?
That's not the case: the PRs, bug tracking and other metadata are stored in an SQL database.
Here are the tables on my at-home deployment:
Edit: deleted a few lines from the list of tables, it was way too long for an HN post...
mysql> show tables
    -> ;
+---------------------------+
| Tables_in_gogs            |
+---------------------------+
| action                    |
| attachment                |
| collaboration             |
| comment                   |
| commit_status             |
| deleted_branch            |
| issue                     |
| issue_assignees           |
| issue_user                |
| milestone                 |
| mirror                    |
| notice                    |
| notification              |
| oauth2_application        |
| org_user                  |
| protect_branch            |
| public_key                |
| pull_request              |
| release                   |
| repo_indexer_status       |
| repository                |
| review                    |
| star                      |
| stopwatch                 |
| task                      |
| team                      |
| topic                     |
| tracked_time              |
| update_task               |
| upload                    |
| user                      |
| watch                     |
| webhook                   |
| [...]                     |
+---------------------------+
67 rows in set (0.01 sec)
Wait, is that something you would want? I would find it pretty awful if my git history for my code had all the other things like issues and PR metadata in it. Could potentially see it being useful if it were stored in an "auxiliary" repo that I could pull separately.
The big advantage would be that all that metadata would also be decentralised, i.e. I'd have a backup of all that on my own PC, and I could push it to a different host to automatically transfer everything, whereas e.g. GitLab has had to write custom import scripts for GitHub metadata, and you just have to hope that everything maps cleanly if you use it.
Of course I want it. History is history. Furthermore, why should PRs, bug reports, and build artifacts not be able to refer directly to source objects and revisions, and vice versa?
A simple way to represent it might be as an ASCII dump of one or more databases representing it all, so you could roll back to any day's state; or as an update-log transcript, so you could roll back to the state after any given transaction. Both amount to successive diffs, so are reasonably compact.
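To make the "ASCII dump as successive diffs" idea concrete, here is a minimal sketch in Python. It assumes an SQLite database with a made-up `issue` table (Gitea's real schema is much larger, and the deployment above uses MySQL), and it relies on the standard library's `sqlite3.Connection.iterdump()` to get a plain-text dump. The point is only that one metadata change shows up as a one-line text diff, which is the kind of thing Git stores compactly.

```python
import difflib
import sqlite3


def dump_as_text(conn: sqlite3.Connection) -> str:
    """Return the whole database as SQL text, one statement per line."""
    return "\n".join(conn.iterdump()) + "\n"


# Hypothetical, tiny stand-in for a tracker database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issue (id INTEGER PRIMARY KEY, title TEXT, is_closed INTEGER)"
)
conn.execute("INSERT INTO issue (title, is_closed) VALUES ('Crash on start', 0)")
conn.commit()
before = dump_as_text(conn)  # this text could be committed as day 1

# Someone closes the issue; dump again.
conn.execute("UPDATE issue SET is_closed = 1 WHERE id = 1")
conn.commit()
after = dump_as_text(conn)  # and this as day 2

# The difference between the two dumps is what a commit would record.
diff = list(
    difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
)
removed = [l for l in diff if l.startswith("-") and not l.startswith("---")]
added = [l for l in diff if l.startswith("+") and not l.startswith("+++")]
for line in removed + added:
    print(line)
```

Committing `before` and then `after` to an auxiliary repository would give the "roll back to any day's state" behaviour described above. Whether that is practical for a busy tracker is a separate question this sketch does not answer.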
> History is history
Except PRs, bug reports etc. can't affect the runtime behavior of your program, only code can. There are many times when I am looking through a dependency's history to see "what changed?" in order to decide if I want to pull a new version, and the ONLY thing I want to look at in that case are code changes.
> Furthermore, why should PRs, bug reports, and build artifacts not be able to refer directly to source objects and revisions, and vice versa?
I do that all the time in GitHub today.
I certainly understand wanting that auxiliary data to be versioned, and I also understand wanting it to be in a portable, distributed format like Git. I just definitely don't want it cluttering up the history of my code.
Also, security is a big deal. I often want to allow pretty much anyone to open a PR or an issue, but I don't necessarily want that now to be part of the repo every time someone clones it.
> Except PRs, bug reports etc. can't affect the runtime behavior of your program, only code can
Are your repos free of docs, READMEs and tests, then?
> I just definitely don't want it cluttering up the history of my code
> I often want to allow pretty much anyone to open a PR or an issue, but I don't necessarily want that now to be part of the repo every time someone clones it.
That is a good point; perhaps solvable by having the webapp frontend provide the GUI but make commits itself? That should give you the best of both worlds.
Fossil keeps them in the same repo but as a separate kind of object.
It's difficult to see how gitea is anything other than a copyright infringement of GitHub.
What are they infringing on? Site layout? There are tons of Github alternatives that all follow similar layout and styles. Gitlab, Gogs, Bitbucket...
Gitlab was basically a pixel-for-pixel clone in the beginning before they started differentiating themselves on features and then developed their own UI. Made it super easy to onboard new employees 5+ years ago, honestly.
Please do yourself a favor and google "copyright".
Look and feel is definitely copyrighted.
Their half-translated UI (my browser is set to Polish) is extremely off-putting.
So many people seem to think that providing low quality translations is improving accessibility. It isn't. It's doing the opposite, pure English websites provide much better UX to international users.
BTW, Microsoft with their automatically translated MSDN docs is by far the worst offender in this area.
That's the case because you speak English. Think about the last time you had to navigate a website in a language you didn't speak (in my case e.g. Chinese or Russian). I remember being very happy with a few (incomplete) clues in English about where to look and what to expect.
By all means, advocate for making it easy to change the language back to the original. But this stance will decrease accessibility.
In the case of technical websites, you have to know English anyway, because even if there is a broken translation of the documentation, function names, enums and all the cool stuff remain in the original language.
> you have to know English anyway
You don't, plenty of developers are getting around having to know English as there are other resources for learning.
I think it's a dangerous assumption that everyone knows English. It's especially true in the Spanish-speaking world, where I know developers who can't have a conversation in English but are good enough to be employable.
I would rather just use google translate to do it automatically on the entire page, and yes I do that quite regularly from German.
Especially when you never see the subject matter translated anyway, which would be the case with Git. There aren’t any conventional translations of words like “repo” or “pull request” in most languages, and when you try you risk making it incomprehensible, or comical.
For example, take their Swedish translation of “pull request”: “Pull-förfrågning”. They had to leave “pull” untranslated, probably because it would be impossible to understand if you actually were to translate it word by word; unfortunately “pulla” is a sexual slang word in Swedish, so the whole translation sounds like a request for sexual assistance.
Eh, "pulla" is not the correct Swedish translation for "pull". It would be "dra" or "rycka". There is a GitHub repository where a bunch of Swedish developers have tried to translate the entire git vocabulary into Swedish, you can see it here: https://github.com/bjorne/git-pa-svenska
The table shows that "pull" is usually referred to as "pulla" but the suggestion is to change it to "rycka". If you do know Swedish, there is a discussion about the naming of "pull" here: https://github.com/bjorne/git-pa-svenska/issues/30
I think it very much depends on the country.
In Sweden, you can assume that pretty much everyone understands English, so if your website has an incomplete Swedish translation, it might just be better to go with the English one anyway.
But in Spain (outside the big cities), not that many people understand English. In that case, it'll be better to serve an incomplete Spanish version instead of an English one, as otherwise they'll be completely lost.
So as always, I'd say it really depends, and it's hard to generalize over all "international users".
I think the important question is about the target audience of this site: how many of each language's programmers know no English?
I'm a native English speaker, so I don't know.
French here. I know lots of developers who don't speak English well. They know specific words that are used in language constructs (while, function, class) or usual software vocabulary (provider, database) but they lack the rest: the grammar, non technical vocabulary, everything.
They may not speak perfect English, but they know it well enough to read docs, blogs etc.
France has an obsession about its language (which is wonderful) and believes that the whole world should adapt (which is dumb in tech). You should attend some union meetings where all docs are expected to be in French "because that's the law". The reality of the world is not a concern of theirs.
Source: I am French, work in tech and see how much this illusion of "everything in French" hurts us. If you do not speak English then either learn, or work outside dev.
Exactly, this is true in many places in Spain and Latin America as well.
You wouldn't be able to have a software design session in English, but they would understand specific English words and even use them when speaking Spanish generally.
> So many people seem to think that providing low quality translations is improving accessibility. It isn't. It's doing the opposite, pure English websites provide much better UX to international users.
Good to know
Personally, I kind of like the balance of gogs/gitea on that subject.
You can switch language quite easily using the menu in the bottom center/right.
Also, in a self hosted context you can disable/enable languages in the configuration file easily.
Also, if you find the translation really bad (I think in most cases, the default translation is from MS automatic translation), you can easily contribute to it:
I personally contributed to the French translation a few years ago, which was really bad in Gogs at the time, especially with some over-translation of technical terms such as "git commit", which was translated to "git commission" (a bit weird and laughable since "faire une commission" can be a weird formal way to say "pooping" in some contexts).
It took me a few hours to go through a few of the message strings and propose a hopefully better translation. It's quite easy to contribute to the project in this manner, far easier than contributing code.
Lastly keep in mind that gogs was created by a Chinese developer, and I'm half expecting that Chinese developers are not as fluent in English as developers from India or Europe. In this context, pushing for an internationalized service is a good call vs an English only UI.
Localization is a difficult problem no one has solved yet. If you find a solution please write about it.
Not translating isn't really a solution.
Wanting everything to be 100% translated is also unrealistic.
When the audience is programmers or generally tech people, not translating from English might indeed be the best option. The lingua franca of technology is English.
E.g. looking at the quality of Microsoft's German translations (which seem to be done by a poorly trained bot), the translation is actually counterproductive, because it's full of special terms translated into German that either would not be translated from English in a "native" German text, or where a different translation is commonly used.
My favourite is the Visual Studio translation for Link-Time Code-Generation. Which is translated as "Link-Zeitcode Generierung" which in English means "Link Timecode Generation".
It is a tough one and yes, here in Germany I would expect the vast majority of developers to understand enough English to use English tools and documentation. (Exceptions exist, notably with old technology like mainframes.)
In other countries like Russia or Japan I noticed, as an outside observer, more reliance on translations.
When translating, one complication is the degree to which one does it. There are terms which can be translated and some where it becomes weird. In some parallel comment "pull request" was mentioned. Translating that certainly becomes weird, but translating "file" as "Datei" is well established and, when translating, expected.
Even a not-so-great translation helps younger would-be programmers. They have a better chance to learn. It takes quite a lot of time to learn English well enough to understand documentation.
How can you say that localization is a difficult problem that no one has solved yet? i18n has been around for as long as I’ve been writing software, and probably longer. So minimum 20 years.
I do agree that it’s not a walk in the park because it takes loads of resources to provide actual translations and one has to make i18n a feature in the code. But... come on. phpMyAdmin was doing this in 2004.
Even in the 1980s Microsoft employed linguists for translating terms like "scroll bar" into different languages.
However, finding the right translation level in a technical context is hard. For which concepts should one translate the name, and for which should one borrow the English term? The French translate "computer" as "ordinateur"; the German "Rechner"/"Rechenmaschine", however, is outdated.
Now what about a git repository clone and commit? How much translation will destroy understanding?
You're unfortunately still stuck in thinking you can just take a string and make it another string and boom you have i18n.
Unfortunately, that misunderstands that language fundamentally relies on way more context than just strings to get the same message across.
You can't just simply translate and have a good UX for other languages.
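One small, concrete case of "more context than just strings" is plural forms. The sketch below is my own illustration, not something from Gitea or Gogs: it uses the Polish word for "file" (plik / pliki / plików) and the usual three-way Polish plural rule to show how a one-string-in, one-string-out lookup picks the wrong form for many counts. The function names are made up for the example.

```python
# The three Polish plural forms of "file" (illustrative choice of word).
FORMS = ("plik", "pliki", "plików")


def naive_label(count: int) -> str:
    """String-for-string approach: "files" always maps to one fixed word."""
    return f"{count} pliki"


def polish_plural_index(n: int) -> int:
    """Pick which of the three forms a count needs."""
    if n == 1:
        return 0
    # 2-4, 22-24, 32-34, ... but not 12-14
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


def file_count_label(count: int) -> str:
    """Count-aware label: the form depends on the number, not just the word."""
    return f"{count} {FORMS[polish_plural_index(count)]}"


for n in (1, 2, 5, 12, 22):
    print(naive_label(n), "|", file_count_label(n))
```

The naive version is only right for counts like 2 and 22; the count-aware one needs the number as context before it can choose a word. Gender, case and word order raise the same problem in other places, which is why a flat table of strings is not enough on its own.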
So far I haven't had any issues, and I am an English-speaker.
We are missing the trees for the forest here. I think that Codeberg is not communicating their core values the best way and my understanding is: to be an open source, ethical, safe home for open source projects under Non-Profit Collaboration Community. If your repo were ever (even temporarily) disabled by a false or intentionally bullying DMCA take down request, you would know right away why every single word in the above sentence is important. And, hosted in Europe, under EU laws.
This is a nice summary: https://blog.codeberg.org/codebergorg-launched.html
Suggestions on how to make the messaging of Codeberg clearer would be appreciated. Codeberg has an issue about this in their tracker: https://codeberg.org/Codeberg/Community/issues/75
Edit: added link to Codeberg issue tracker
> I think that Codeberg is not communicating their core values the best way and my understanding is: to be an open source (...)
But is it? Where can you find the code for codeberg itself? Is it possible to download it all and install it locally on an air-gapped server?
See https://sourcehut.org/ for a vastly different source code hosting solution. If you're interested in something other than an open source GitHub clone, then that is more likely to interest you.
SourceHut is great, though it's still missing organizations support (they promised to add it in a few months). The only problem compared to Gitea is deployment. SourceHut is written in a mix of Python and Go, while Gitea is pure Go. Gitea is essentially one binary. I wish SourceHut would dump Python and rewrite everything in Go.
"Everything is one binary", while convenient, just means "monolith", and usually "unscaleable monolith". That has nothing to do with Go. Plus, compiled-in filesystems and runtime-extracting applications are a thing, so anything can be a binary.
Sourcehut is specifically designed to be a composeable set of services, so regardless of language, it's never going to be one binary.
> just means "monolith", and usually "unscaleable monolith"
Hm, monolith != unscaleable monolith. Monolith just means you'd have to do vertical scaling instead of horizontal scaling, which arguably is easier and less error prone, as you still are only dealing with one instance and not building a distributed system all of a sudden.
I usually try to keep to vertical scaling for as long as possible, until either the performance starts to plateau or the product/project becomes big enough that it needs different backend dev teams working on different things concurrently.
I did not say "monolith" equaled "unscaleable monolith", I said that such a setup is usually an "unscaleable monolith".
The reason for this is that writing anything at all to be scalable is quite difficult, and larger monoliths tend to reflect a laziness and false sense of simplicity that tends to also be reflected by the internal design. For this reason, I'd consider "large but scalable monoliths" to be in the minority.
Furthermore, I'd argue that if scaling of a monolith isn't a problem, it's simply because its load is insignificant in the first place. Once the scaling game starts, vertical scaling quickly ends up being infeasible.
Of course, that does mean that you can start out and experiment with a monolith while a product is young, but I'd prefer to have hashed out the overall design before I start dealing with production systems.
I'd design components around clearly distinctive areas of functionality rather than the distinctive teams that need to write it.
> See https://sourcehut.org/ for a vastly different source code hosting solution
No imprint, no privacy statement, no deal.
Imprints on websites are only required in a few countries. Frankly, I find it absurd that on the one hand, my contact details as a domain owner are being removed from WHOIS citing data protection, but on the other hand I am required by law to put the same information on the website I'm running on the domain.
But sourcehut is a commercial site.
I am not going to pay money to some organization when I don't even know who that is and where they are located.
Believe it or not, hosting costs money. It's just that with a free account at GitLab or GitHub you're not the customer, but simply a lure to bring in paid customers. This means that your interests are not what drives company decisions.
Sourcehut as a company is designed to make the relationship between you and the company clear: You are the paying customer, and funding the services. There is no secret agenda or external investor whose interests they must please. Plus, it's the most transparent company I know of.
While you could just ask your questions in #sr.ht, you can also read the financial report (https://sourcehut.org/blog/2020-01-13-sourcehut-q4-2019-fina...) and see what colos they are paying for. At the time, hosting was in San Francisco and Philadelphia. You can also use their Prometheus instance to check how things are running if you'd like (https://metrics.sr.ht/graph). The faces and profiles of both guys in the company can be seen here (https://sourcehut.org/consultancy/).
I don't think your questions can be answered to the same degree from other hosting providers, despite imprints.
That is entirely your choice. In my experience, there are many companies out there that provide good services, but don't publish any address on their website. I am sure that you could get one if you ask nicely. It just doesn't make sense to post that info publicly in every case.
Very interesting. Thanks for the link. I will definitely be looking at this.
Sourcehut is very, very different. I'd like to know how this is working for devs.
I love it, I can't say bad things about it. I can't wait until the "hub" service is finished, it will allow you to combine any number of the other services into some sort of project structure. I love the builds service, too, it's really nice and easy to use. I use the todo service as my life tracker.
Yep, the builds service is nothing short of amazing. It's super useful to me because it lets you easily target a huge number of systems, including BSD systems that couldn't be supported through Docker or the like.
They added the ability to take build artifacts now which is great. I haven't used it yet, but I plan to use it for my static gen site eventually.
I'd love to be able to use my own image, though. I feel bad every time my build script has to use 3 minutes of build server slot time to download and install a bunch of packages before it actually builds my project. If I could make a custom Docker image to build in, or if there was caching, it would use less time.
I am curious; what does this gitea clone provide over the hundreds of other clones I could be using that are also free and non-commercial?
Great question. Let me copy/paste the explanation from one of the founders:
Collaboration platforms like SourceForge, GitHub, and GitLab enabled and facilitated the development and organization of free software projects. But as developers of free software we fell into a trap – we poured invaluable source code, documentation and last not least a huge and steady stream of data about our personalities, interests and social networks into proprietary platforms. These platforms are no charities, they operate to fulfill the commercial objectives of their founders and owners, outside of our control. They are valued and sold for billions of dollars. Codeberg is trying to fix that. The nonprofit Codeberg e.V. has been founded as independent non-government organization to launch and build a free home for Free and Open Source Software.
Also see this short SFSCon talk:
Thumbs up for mentioning this, and thanks for your effort. I've always found it odd (to say the least) that F/OSS projects are happily giving away their devs' and users' eyeballs and clicks to commercial entities when many F/OSS projects advertise altruistic or other ethical motives and licenses. Using github.com (or any other project hosting service) for your project's home page also ruins the web and puts all the power in the hands of very few, when regular DNS and web hosting has regulated rules where providers and registrars can be negotiated with, switched, appealed, etc. Not to mention that github.com blocks indie search crawlers, adding to the search monopoly we have.
Sorry for the late reply, but looks great. Especially with the voting system!
(side note: those slides are amazing lol)
I suppose it's the service, not software. Much like Github is first of all a service, a site.
So compared to gitlab and sourcehut (my other favorite options for free github alternatives), this is librejs complaint (sourcehut has this, don't think gitlab does), and has a github-like UI. It seems to be pushing the privacy angle and is hosted in the EU. Any other big reasons to prefer this over gitlab/sourcehut/self-hosted?
There is also OneDev: https://news.ycombinator.com/item?id=22081419
> this is librejs complaint
What does this mean/imply exactly?
It's a typo of "LibreJS compliant": https://en.wikipedia.org/wiki/GNU_LibreJS
I couldn't find any "about" page. Who owns it? Who are the people behind it?
Here's some info from their blog https://blog.codeberg.org/codebergorg-launched.html
You didn't find it because you didn't look. Scroll to the bottom of the page and click on "Imprint". Quote:
"Codeberg is a non-profit organisation dedicated to build and maintain supporting infrastructure for the creation, collection, dissemination, and archiving of Free and Open Source Software. If you have questions, suggestions or comments, please do not hesitate to contact us at email@example.com."
It's a kind of club ("eingetragener Verein") that is funded by its members and through donations.
I think a lot of people are probably unfamiliar with any definition of "imprint" that would cause them to look there. I know I can't explain what it means in this context.
It comes from Impressum, essentially a form of identification for publishers, but imprint is the English word used for the German legal requirement.
Your comment is very helpful, but starting it with "You didn't find it because you didn't look" is not very fair. Most native English speakers would not expect 'Imprint' to have that information.
FWIW I saw the Imprint link at the bottom, but I didn't know that's where I could find company info. However, there was a link to their mission statement in the middle of the page which includes the same info.
Still not very informative. No names, no funding info (who, how much, can I help?).
To me, not being able to divine the people behind the initiative is a bit incongruous with being a community effort. Makes it feel like a company, not a club.
It's a German registered association (e.V.), i.e. it's not only a single company or person behind it. Funding is done by membership fees. As a member you also get insights (regular newsletter) into the financial situation, e.g. how much money is on the bank account.
Finally an alternative to Github and Bitbucket hosted in Europe, great.
Does anyone else think Github now has a name recognition advantage? I am not a super ninja developer so I still have to make resumes, and in there I can just put the github logo and a username. Saves space and looks quite professional. I can't do that with newer services. Or am I just being overcautious here?
Edit: And it's not just about resumes, it's the case for getting traction for your project too. I feel comfortable contributing to a project hosted on github and am sort of wary of things hosted in other places.
Having a remote git repo monoculture where it has to be Github or nothing is unhealthy for the software ecosystem.
Gitlab is a good alternative, but why not have more choices? Anybody who might want to hire you can click a link to a Codeberg/Gitea site and see what looks like a very familiar repo. It should be the project itself that is attracting users and devs, not where the git repo is remotely hosted.
For free-software projects, a Gitea site like Codeberg might be an even better choice because of the conflicts involved in using a commercial entity like GitHub that uses closed-source software yet bases its entire business on a free project (Git).
repo.or.cz — one more free hosting alternative to GitHub, especially for FLOSS projects.
Its mobile UI is broken all over on my system. But I am all in for getting inspired by exact Github UX/UI. They're the main reason I don't want to switch to alternatives even if I hate Github after MS-merger and following "improvements of the service".
What changes made after the MS acquisition do you dislike?
Personally I most dislike forced deanonymization with 2fa or annoying me with device verification which locked me out of my anonymous account with old expired email (I didn't need it secured and I don't want to prove anything to MS to restore it too). Some people also reported increasing external CI services throttling in favor of Actions, I wouldn't be surprised if they give preferential treatment to Packages too. In the end I don't want power of MS over my code and workflow to grow too much. I know they claim Github is independent but it's not really reassuring.
This is quite normal for an enterprise product though, and I agree on the deanonymization part. But I would not expect anonymity on such a platform anyway because GitHub is very often used in the place of a resume or a CV.
Simply put, you are not the kind of user they want. Happily there are alternatives. I started backing up private repos to SourceHut.
I would be glad to switch, but I am a sucker for good UI. I can even forgive some lost functionality for it. Most alternatives were not to my liking.
I'm in the same boat. For now I don't mind because I use a few other Git hosting services as a backup while still using GitHub. I'd like to move to another service though, it'll be a conundrum.
It is not clear if this is a free software project. Can it be installed locally? I see "open source" written here and there, but no link to the actual code (besides gitea).
Yes it is, see here: https://codeberg.org/Codeberg/build-deploy-gitea
Gitea can run locally.
Ok, so top comment (currently) is about localisation. A really interesting topic, but has anyone tried it out yet and have opinions on the functionality?
Yes, I'm actively using Codeberg for my OSS projects (I still have some on GitHub, too), e.g. https://codeberg.org/hjacobs/kube-web-view/
Some functionality is lacking compared to GitHub, but it does what it needs to do for me.