SwellJoe 10 hours ago [-]
Anything "for agents" needs to provide some kind of evidence it's better than what the agents already have baked into the model training data. It can't just be "easier" on some dimension, because the model has already learned the hard parts of the old thing and models can't make new memories to learn new things, so there is always a context cost for the new thing.
Models know git because there's a monstrous amount of git in their training data. Models never heard of a new thing "for agents", so you have to teach them to use it via skills and docs. Models can, of course, follow documentation, so there's nothing stopping them from using the new thing...but, the new thing "for agents" starts the race well behind the known thing that was built for humans a decade or two ago and has huge amounts of training data baked into every model.
I'm not saying nobody should make new things (an accusation I've gotten when saying something similar about a previous "for agents" thing), of course people should make new things. I'm saying that when I see "for agents", I think, "prove it". Agents don't have trouble with git, so there's gotta be some kind of pain point about using git with agents that I'm unaware of that this solves somehow (but isn't expressed on the page) or this isn't actually for agents, it's just a project someone wanted to do (and that's also fine!). But, if the latter, "for agents" is merely marketing and I'm not interested.
atombender 6 hours ago [-]
I'm not sure I understand this argument. I create new tools all the time as part of my development work, and I have skills stored that tell agents how to use them. They use them flawlessly.
When I say "benchmark the query engine using the foobar dataset and compare it to run 431", the agents go and run my special benchmark tool and use the different subcommands to compare results and so on.
I'm sure a new VCS would be a little less smooth sailing, but not by much.
sdesol 5 hours ago [-]
I think the issue is that it is going against a very well established pattern. I have a tool that wraps ripgrep so that search results always include context, and from time to time the agent will use ripgrep by itself, and when I ask why, it would go "yeah I should have done that"
There are workarounds though, and I am creating what I call knowledge triggers for Pi that are similar to Claude's "PreToolUse", so having the agent use oak all the time is not an issue in my opinion.
The challenge for oak is: why? Considering that I actually want to slow agents down so I can ensure they are doing the right thing, and because the massive bottleneck is the LLMs themselves, speed measured in milliseconds or even seconds will not concern many.
I thought oak was more along the lines of "we know how to inject context into the prompt based on code that is stored in oak", for example. Faster operations can help, but the use case is limited. The missing piece for better/correct code is context at the right time.
nextaccountic 2 hours ago [-]
> I think the issues is, it is going against a very well established pattern. I have a tool that wraps ripgrep so that search results always includes context and from time to time, the agent will use ripgrep by itself and when I ask why, it would go "yeah I should have done that"
There's a limit on how many simultaneous instructions an agent can follow (the exact number depends on the specific model, so instructions that are fine for one model may overwhelm another). If this keeps happening, consider trimming your instructions or, even better, solving it at the harness level (like intercepting and rewriting ripgrep calls to use your thing, like rtk [0] does in agents that support this).
Overall, never leave to an agent an instruction that must be followed at all times. For example, doing things in a git hook beats a multi-command workflow every time the agent commits, etc.
Is this state of things forever? I don't think so. Very soon models will become so much better that this will be a non-problem.
yep. claude keeps "habitually" trying to use `rg -rn` instead of `rg -n` because it was instructed to use "rg" instead of "grep" by Anthropic, but uses arguments for grep: `grep -rn`. My instructions and "memory" are not helping. "Oh, I did it again, and you've instructed me not to". Older tools are better for current "agents".
skissane 6 hours ago [-]
> Models know git because there's a monstrous amount of git in their training data. Models never heard of a new thing "for agents", so you have to teach them to use it via skills and docs.
Another option: when model invokes standard tool, rewrite the invocation to newfangled tool.
Bunch of ways of doing it:
(a) Invocation of standard tool returns error saying to use newfangled tool instead
(b) Invocation of standard tool returns message saying it has been dynamically rewritten to invoke newfangled tool, followed by newfangled tool output
(c) Invocation of standard tool in context is dynamically rewritten to invocation of newfangled tool, prior to execution
In case (c), the model ends up thinking it somehow knew about this new thing all along, even though it actually didn’t
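As a rough sketch of the three options (not any particular harness's API): assume the harness sees each shell command as a string before it runs, and that `ctxsearch` is a made-up name for the newfangled tool, which happens to accept the same arguments as the old one. In Python it could look something like this:

```python
import shlex

# Hypothetical mapping from a familiar tool to its replacement.
# "ctxsearch" is an invented name used only for illustration.
REWRITES = {"rg": "ctxsearch"}


def rewrite_command(command: str, mode: str = "c") -> dict:
    """Decide what the harness does with a shell command issued by the model.

    mode "a": refuse, and tell the model to use the new tool instead
    mode "b": run the new tool, and tell the model the call was rewritten
    mode "c": silently rewrite the call before execution
    """
    argv = shlex.split(command)
    if not argv or argv[0] not in REWRITES:
        # Not a tool we care about: run it unchanged.
        return {"run": command, "note": None}

    new_command = shlex.join([REWRITES[argv[0]], *argv[1:]])

    if mode == "a":
        return {"run": None, "note": f"error: use `{new_command}` instead"}
    if mode == "b":
        return {"run": new_command, "note": f"rewritten to `{new_command}`"}
    return {"run": new_command, "note": None}
```

Only modes (a) and (b) put extra text into the model's context; mode (c) changes what runs without the transcript ever mentioning it. A real replacement tool would likely need its arguments translated as well, which this sketch does not attempt.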
citadel_melon 3 hours ago [-]
Options (a) and (b) add more bloat to the model’s context window and option (c) seems to reduce to having similar functions that already existed. There is also the option to trick the LLM into thinking that it’s using the old function exactly as-is, while the harness abstracts away a completely different methodology. Cursor often does exactly this: they use an internally built vectorized search when the model calls the default “find” bash command. The LLM is none the wiser that the function’s implementation is completely different.
Regardless, in any of these cases, the implementation for any of these above options may be vastly superior to the “naive” implementation for agents — but then the parent comment here is right that an engineer would need to justify their implementation to users, not just make a loud conjecture. It’s a non-trivial claim to say that a bespoke solution not present in tool-use training and accounting for context-rot would result in a better performing model. Moreover, justifying an agent-specific efficiency gain that humans wouldn’t benefit from makes the claim even more non-trivial. Using Sagan’s razor, it’s then reasonable for people to ask for a comparably non-trivial amount of evidence.
mrmrs 9 hours ago [-]
Totally correct on the burden of proof here. Agents DO know git extremely well.
There’s a huge amount of git in model training data, and anything new starts
behind because you have to teach the model what it is, what commands to run,
and where the sharp edges are. For us “for agents” does not mean “new syntax
that we hope agents can read docs for.”
The thing we’re trying to optimize is not whether an agent can remember the
command. It’s the runtime shape of agent-driven development.
When an agent drives a VCS through a captured terminal, things that are
tolerable for humans become direct costs: clone/setup time, worktree setup,
full status output, huge diffs, branch cleanup, interactive prompts,
shared-checkout mutation, repeated preflight checks. Those costs show up as
wall time, bytes over the wire, transcript tokens, and recovery steps.
So the Oak bet is narrower than “agents can’t use git.” They can. The bet is
that if you assume branch-per-agent workflows, lots of parallel sandboxes,
large repos, and non-interactive command execution, the VCS interface should
have different defaults if you want to optimize for shipping speed and
efficiency of token usage. If you're already going fast enough and not running
out of tokens - then using oak seems pretty silly.
People do not need to ditch git to try Oak out. One workflow we care about is
letting agents work in Oak where the agent-specific costs matter, then
exporting back to git for the human review, CI, release, or compliance
workflows.
Totally agree this should be provable and benchmarked. The homepage has
Oak vs Git numbers because we do not want “for agents” to just be vibes. We’re
measuring transcript bytes, estimated tokens, tool calls, wall time, large
diff/status behavior, and contention in agent-style workflows. We’re also
working on the benchmarks repo in the open: https://oak.space/oak/benchmarks
The exciting part to me is that we can already improve on tokens and timing
despite starting with the model-prior deficit you’re describing. If we can
win on measured agent workflows while git still has the advantage of being
deeply baked into the models, I’m incredibly bullish on where Oak can get to
as the tool and the ecosystem mature.
Longer term, if Oak proves useful and sticks around, future frontier models
will likely have more Oak examples in training data, which lowers the upfront
learning tax for an extra boost.
fulafel 3 hours ago [-]
How did you speed up things (eg clone, worktree setup) compared to Git? Could the same work for human facing tools?
zdgeier 2 hours ago [-]
Mostly through networked file system mounts with FSKit/FUSE backing when working on tasks in parallel. May be applicable for human facing tools but I think workflows there are already pretty set with having files locally and mounts need some lifecycles that agents are probably better at handling.
oliver236 8 hours ago [-]
[dead]
kjuulh 12 hours ago [-]
I've built my own workflow for using agents on git, as I now often have to make changes across repositories, or in the same repository for different tasks. I could use worktrees, but I'd rather invert it: give agents the ability to have a workspace that they pull repositories into, create branches as they want, commit on main, it doesn't matter. The agents don't bother each other, and when I finally have to merge, conflicts are either resolved, or it is just smooth sailing.
The tool is called gitnow. It is honestly quite simple: just create a project, add the repositories you want and get to building. I've had great success having another Claude chat (or whatever) use the tool, coupled with zellij, but it could also be zed, tmux or whatever.
Secondly, it also pretty much solves the problem of the agent dumping memory files everywhere: they now basically have a scratch space that is theirs, where they can keep their tasks, and just update the repositories as needed.
If you use it, use gn in the shell after the eval: it will actually invoke cd instead of creating a subshell.
I have absolutely no idea what this offers that makes it better than git (or any other VCS for that matter) for agents.
There’s some mention about performance, which is great, but the performance of git isn’t a bottleneck for agents.
There’s some mention about token use being reduced, which is great, but how have they achieved that vs git’s porcelain modes? And why does token count require a whole new VCS, and thus incompatibilities with all the established git ecosystems?
I really want to find reasons to like this but it’s probably some of the worst product marketing I’ve seen. And something this significant really does need to sell itself hard if you’re going to get enough people in a project team to agree to switch away from git
laserbeam 10 hours ago [-]
> I really want to find reasons to like this
But why? Why would I want to like a project which seems to invent problems rather than solve any? I don’t want to like this.
hnlmorg 7 hours ago [-]
We don’t know if it’s inventing problems or solving them. That’s the point.
desmondl 10 hours ago [-]
The main site mentions being able to "mount" a branch, vs. cloning a new repo or using git worktrees. And messageless commits for intermediate work. Besides that tho I don't see a compelling reason to ditch git, but looks interesting enough that I want to keep an eye on it
y1n0 7 hours ago [-]
What’s wrong with worktrees? To me that is exactly what mounting a branch would be. I use them a fair amount.
Edit: I see people bringing up lazy file checkouts in conjunction with mounting a branch. For some of the enormous repos people work in this makes sense to me.
steve_adams_86 10 hours ago [-]
> messageless commits for intermediate work
Would this be like `git commit --allow-empty-message`?
>the performance of git isn’t a bottleneck for agents.
Eh, it depends on the workflow. Especially if you have certain stack based workflows. Worktrees are kind of a half solution here, but depending on the repo type and whether you are dealing with LFS or sparse checkouts, I've had agents struggle really hard to work through a stack or rebase things without a lot of thrashing or being IO bound by just stumbling into operations in a boneheaded way. Now I have AGENTS.md/skills/hooks guardrails littered about to try and work around things.
hnlmorg 7 hours ago [-]
How much of that is due to the git CLI and how much of that is inherent flaws with the git VCS?
I know git (the VCS) can become a bottleneck with massive monorepos at the scale of Linux or Microsoft. But is anyone likely to port them to something new just to be a little more agent friendly? And if the goal of this new VCS was to make life easier for large monorepos (for humans as well as agents) then why doesn’t the author mention that on the project’s website? Because that’s exactly the kind of thing that might make this an easier sell to project teams.
PunchyHamster 11 hours ago [-]
checkout repo into tmpfs
grayhatter 10 hours ago [-]
> Eh, it depends on the workflow. Especially if you have certain stack based workflows.
I would normally assume there's a 0 percent chance that `git` (the binary) has a significant impact on LLM based devel. The same applies to git, the protocol/format/tree.
I'd love to hear what it is about your workflow that makes any part of git a noticeable proportion of the process. Unless you mean your LLM just can't figure out how to use git?
mohsen1 11 hours ago [-]
The lazy mount is very interesting. This is similar to how google3 works at Google; I have not seen any similar implementation in open source so far.
Git sparse checkout is helpful but checking files out as they are needed is much more flexible and intuitive.
Microsoft VFS for Git / GVFS is the closest that I can think of.
There is room for this lazy mount idea to be built on top of Git
zdgeier 6 hours ago [-]
I do wonder how far you can make git work like google3. Partially why I'm making Oak is because I think it might be hard to impossible to implement the necessary features for monorepos to work correctly in Git. I don't doubt that it can be done, I do wonder how it will feel though.
kccqzy 5 hours ago [-]
If the goal is just to make it work like google3, then hg and jj and sapling can all already achieve this. There’s no need for a new contender here. The differentiation must come from something else.
But of course at Google the file system part (CitC) is a layer beneath the version control system and is shared across different vcs tools.
zdgeier 4 hours ago [-]
I do think hosting is an important part of the VCS story. I agree that hg and jj and sapling are capable of being front ends to a google3-like backend and a GitHub-like thing to support it (Google has this internally for jj). Of course some people are working on hosting solutions for these, but it feels wrong to me that hosting platforms and their underlying VCS are not made by the same team. IMO people like google3 so much because it’s one integrated system, which is the approach I’m trying with Oak.
pnw 11 hours ago [-]
Zach is underselling his achievements here, having previously built the Jamhub VCS which was acquired by a well known founder.
dang 9 hours ago [-]
Hmm - of course I went looking for the past HN discussion and it seems there wasn't one - that's a bummer:
Partially why I got so excited about version control is how well this post blew up when I posted.
dang 7 hours ago [-]
Oh good! Added above.
N_Lens 4 hours ago [-]
This project falls into the classic "More" trap. Agents are fast at creating code so let's make them even faster (more). However any rational observer can see that the bottlenecks for throughput are no longer at this segment of the process.
Human decision-making, communications and awareness are the key bottlenecks, not code generation and commit speed, by several orders of magnitude.
And I think that's a good thing if we want to avoid mass-psychosis.
forty_one 11 hours ago [-]
Looks very interesting, but it's difficult to see the benefit over git right now apart from performance. Don't get me wrong, that's good, but I don't think it's a big enough proposition to get people to ditch git and move to oak.
Since it's early, here are a couple of things I'd loooove git to be and it's not; maybe you can consider going in this direction and, if there are many more like me, get a large user base:
- The private/public quantum shouldn't be a repo but something more fluid within a repo. A public repo should be able to have private sub-directories, files, etc. It should be fluid in this regard, so big projects could open-source *some* features, not all. Right now it's all or nothing, and that closes the doors to many big closed projects.
- env variables. If you could make their usage easier and more seamless within oak, that could convince many (me included). It's really a headache to deal with env vars and git, and it shouldn't be the case.
- Collaboration for agents beyond PRs. I don't know exactly what's the flavor for this, but I know that fundamentally the create PR/merge circle of git is not how it should be.
Great initiative and good luck!
zdgeier 10 hours ago [-]
Totally agree with everything. Definitely will be hard to get people to switch. Also love the monorepo idea you mentioned. It should totally be possible to keep the benefits of a monorepo without the downsides of git submodules. So you should be able to open-source parts of a repo without open sourcing the whole thing. One of the benefits of building from scratch is that this is pretty straightforward. Also the other ideas you mentioned are really awesome. Thanks!
pixlmint 12 hours ago [-]
Did you have your agent talk you into making this something separate over building on top of git?
zdgeier 12 hours ago [-]
Haha I wish, but I've been working on VCS's separate from git for a while now. Although I do love git, I've wondered for years before agents if something could be made using something different, rather than building something on top.
pylotlight 2 hours ago [-]
I think everyone's been asking for "what's next" around git for a while as well :P
Theo certainly mentions it plenty.
zdgeier 12 hours ago [-]
I'm also secretly a massive fan of Dr. Hipp and his work on FossilSCM [1]. I love a bunch of his design decisions there and wanted to apply them to a new system.
I get pi.dev AI agents to use Fossil-SCM with a skill. Lots of built-in features like Wiki and Tech Notes. I’m sure there’s lots more to use with agents.
andrewshadura 11 hours ago [-]
D. R. Hipp, not Dr. Hipp.
cornstalks 9 hours ago [-]
He earned a PhD…
manmal 7 hours ago [-]
> but the speed is a consequence of the design, not the pitch.
You kinda lost me there. I’m supposed to use a central technology whose author can’t be arsed to write a few paragraphs?
sourdecor 14 hours ago [-]
I have always wanted a version control system that was basically Emacs/Vim/Neovim's undo-tree[0] but persistent and social. Why do I have to manually talk to git? You are a computer, track every modification I make while editing and let me decide (or help me decide) on what a checkpoint is.
Seconding Jujutsu! I've been working to add Jujutsu support to basically every open-source tool and framework I use, including the agentic ones [0]. While it doesn't work for everyone, I've found it can really work for some people. (like myself)
It's absolutely great for keeping a bunch of exploratory changes alive, quick prototyping, etc. as I tend to do with basically every source I have on my machine. I don't have to think at all about the stuff I hate about git (babying the index, being careful to amend and etc. right the first time because undos are annoying, etc.)
submodules are cursed. LFS support looks to be coming soon in the form of jj ignoring LFS files and just allowing you to use git-lfs to manage them.
steveklabnik 9 hours ago [-]
Submodules are already that way as well.
y1n0 7 hours ago [-]
Mostly true, but the weird edge case I run into is workspaces. Since they seem to be independent of and not backward compatible with git worktrees, there is no fallback to git for submodules within a workspace.
We still use submodules in a number of places at work so it’s a bit of friction for me. Other than that, I’m rapidly becoming a jj convert.
steveklabnik 5 hours ago [-]
Ahh, so workspaces don't currently support colocation, aka "put a .git directory in there". So that's what's up there. Interesting corner case! I know upstream is working on it.
LoganDark 5 hours ago [-]
Submodules are cursed, but I like to clone my repos without colocation, but then sometimes find that I need to re-colocate in order to `git submodule update`.
Relatedly, when I use filesystem paths as remotes they need to be colocated or else it doesn't work, which is a little annoying!
is there some more information about DeltaDB? this seems to be an early access feature and not something available at the moment but I would be interested to learn more about it.
I would recommend just linking to a few sentences that say how Oak is different than Git, rather than a personal backstory. (https://oak.space/docs)
My initial reaction is to ask whether this is something that could be built on top of Git, rather than replacing it. Describe the data model - what is a "commit", what is a "branch" ... If it is the same as git, then why not reuse it?
(The submitted title was "Git is forever. I'm building Oak anyways." and the submitted URL was https://oak.space/blog.)
zdgeier 12 hours ago [-]
Awesome, thanks! You could also change it to the homepage if you'd think that would work better for people. https://oak.space
dang 12 hours ago [-]
Ah thanks - I forgot to change the top link. In this case I think https://oak.space/oak/oak is probably more enticing to the community because they get to see what it actually looks like - so I've used that URL instead.
zdgeier 12 hours ago [-]
Sick, looks great!
CHUNK_CHUNK 4 hours ago [-]
Interesting to see more dev tools designed specifically for agent
workflows. I've been building a local tool that monitors AI agent
API calls and costs in real time — the "I have no idea what my
agent just did" problem feels like it's only going to get bigger
as agentic coding tools spread.
no_circuit 11 hours ago [-]
My impression with this space is that you'd need to fundraise startup-style, which I'm assuming you'll do, to catch up with everyone that is doing a similar thing.
The problem space and solution have been around for a while in big tech, and now there are a handful of publicly known products, and probably a couple dozen still secret ones. It is just that now, with AI/agent volume, there probably needs to be an easier solution for quick, narrowly focused VCS views.
For filesystem mount, usually FUSE-FS, of a version control system to enable multiple viewers without transferring a lot of data see some current/previous implementations:
- Google: Piper via CitC (Clients in the Cloud) often used with Cider (web IDE)
- Meta: Sapling on EdenFS (from what I read, never worked there)
The main issue I see is with the site: it just seems like a big blob of AI-generated text I need to wade through to understand what is going on. The cool part wasn't even shown off: your GitHub UI clone that you can get to from seeing the benchmark code.
FYI, I also think the 4-way arrows logo has been used before, and still might be in use. I tried searching, but I think I saw a multi-colored one, maybe in a UK-based IT corporate training company's class I attended.
zdgeier 9 hours ago [-]
Definitely agree and great points. This is going to be a very busy space in the next year. Haha, I've never used ClearCase but my friend told me about mounting VOBs in ClearCase so I'm excited to bring it back. Thanks for your thoughts!
danpalmer 5 hours ago [-]
The focus on speed is interesting here, it's something I think less and less about with agents. I'm not waiting for git operations, and at the rate agents run it's just not a major factor. Agentic development is all about throughput, not latency.
This looks interesting regardless, but I do wonder if the latency focus is the wrong way to sell this.
drob518 4 hours ago [-]
Oh, it’s “for agents,” you say. Well, I’m sure that makes all the difference.
andunie 2 hours ago [-]
Wait, I clicked and read, but I still don't know what this is about!
zdgeier 2 hours ago [-]
Happy to describe or answer questions here. It's a version control system (like Git/GitHub) designed for agents. Some specific things that make it different than git are a simplified branching system and networked mounts so you can work on tasks on a repo in parallel.
CrzyLngPwd 12 hours ago [-]
Back in the day, 2020, the effort to create a program/website/service was the prohibiting factor, which meant the sediment remained at the bottom of the barrel where it belonged.
Now, every brain fart is published as a finished product no one wanted.
coldstartops 11 hours ago [-]
I am curious, how do you handle latency issues for on-demand access? I saw you use FUSE (and FSKit), and from my experience it is a pain to make filesystems in userspace work on-demand over WAN because a) latency and variable RTT; and b) you can't saturate the wire and aggressively read ahead, otherwise native apps will freeze, lag, or just make the UX unpleasant, especially if there are too many placeholder files, or large files with random jumps in them.
zdgeier 11 hours ago [-]
I think what makes FUSE/FSKit great here is that agents usually only need to see the file metadata + read a handful of actual files, unlike some applications that need to read many things. If you're doing huge rewrites, this is a problem, but most tasks are usually somewhat small. It's definitely a problem that I've run into though; we do cache aggressively to try to solve some of this, but it'll never be as fast as reading/writing directly to disk. We have benchmarks [1] if you want to take a look at how we're testing some of the performance there.
Couldn't run your benchmarks, as I did not create an account, and a bit of a different beast that I am comparing against (P2P distributed filesystem), but these are my numbers and setup, and the on-demand part lines up with what I have observed:
Setup: a Linux box on the other side of Romania (compared to where I am living) reading from a Windows box in Singapore (~200 ms RTT)
- reading 1 MiB of a 1 GB remote file pulls only 16 MiB (~98% avoided) - this is because of my fine tuning optimization choice
- first byte approx: 2.3s
- git-LFS repos also clone cold over the mount byte-perfect (separate Mac - Linux run on a ~20 ms RTT)
The thing that I do differently is that my metadata is eagerly pushed, as I optimized for content streaming.
And 100k-file tree mounts I did not test yet.
But my goal was to have instant file access for generic files between apps, and peer to peer, supports also Windows :D
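A quick sanity check of the ~98% figure, as a sketch: it is just one minus transferred bytes over file size. Whether "1 GB" means 10^9 bytes or 1 GiB is an assumption either way, so both readings are shown; the function name is made up for illustration.

```python
MIB = 1024 * 1024


def fraction_avoided(file_bytes: int, transferred_bytes: int) -> float:
    """Share of the file that never had to cross the wire."""
    return 1.0 - transferred_bytes / file_bytes


transferred = 16 * MIB  # bytes actually pulled for the 1 MiB read

# Assumption 1: "1 GB" is a decimal gigabyte (10^9 bytes).
print(f"{fraction_avoided(10**9, transferred):.1%}")       # 98.3%

# Assumption 2: "1 GB" is really 1 GiB (1024 MiB).
print(f"{fraction_avoided(1024 * MIB, transferred):.1%}")  # 98.4%
```

Both readings round to about 98%, so the unit ambiguity does not change the conclusion. The other way to read the same numbers is as read amplification: 16 MiB pulled to serve a 1 MiB read.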
That unfortunately doesn’t match my experience at all. My Claude often runs rg in the repo attempting to find things that need to be changed. And of course Claude still needs to invoke the build tool to ensure the change can be compiled, which necessarily involves reading almost every single file at least for a fresh checkout? Or did you envision the build tool being completely remote?
bwhiting2356 5 hours ago [-]
> without needing to download everything
how does your agent run tests or click around the UI to verify changes if it doesn't have the full code?
zdgeier 5 hours ago [-]
Good question, the files are grabbed on demand from the server so the agent can fetch everything it needs to run tests, a dev server, or anything it does normally. Now this might be slightly slower in some cases where the history is short, but the bigger the repo and files the more this makes sense. So the full code is available and buildable, just over the network instead of locally.
Another thing: inside these mounts, build artifacts and directories like node_modules can act kinda weird, so we just have some extra context in the AGENTS.md to host these in a different location from the mount. Or agents usually figure this out on their own, in my experience.
desmondl 10 hours ago [-]
I would have loved to try locally, but the installer asks me to create an account and I don't want to do that yet.
zdgeier 10 hours ago [-]
Agree we should definitely support offline flow. You can download the binary manually if you would like [1]. Although the offline/self-hosted flow isn't fully tested right now, would love some feedback on it if you're able to check it out!
> This repo was written almost entirely using AI with human oversight. If you see anything that needs fixed or would like to contribute, please email ... or reach out on Discord
Why not just provide an email address that's delivered directly to the agents you have developing Oak?
I didn't delve into the benchmark repo to understand what your loop is measuring. Why would an agent (without fine tuning or oak-specific context) be faster with oak than it is with git or jj?
fellowmartian 8 hours ago [-]
“Why not just provide an email address that's delivered directly to the agents you have developing Oak?”
Genuinely can’t tell if you’re joking or not.
zdgeier 7 hours ago [-]
> Why would an agent (without fine tuning or oak-specific context) be faster with oak than it is with git or jj?
A large part comes from mounts. Being able to use FSKit/FUSE to make a change to a repo rather than doing a partial/full clone. A smaller part comes from having optimized context (json output) that agents are able to parse better with fewer tokens.
Pet_Ant 14 hours ago [-]
What I want from a version control system is to capture events in history, not as changes to files but as events that capture a process.
If I split a file in two I still want to be able to see blame correctly for the author of the function, not one file as freshly created and the other with a bunch of deletes. I wish commits could be folded into larger commits so that you can still capture the individual changes but also not see them by default when looking at the history of a file.
Just a more human centric perspective on change history where it captures the way we talk and think about changes.
WolfeReader 14 hours ago [-]
"I wish commits could be folded into larger commits so that you can still capture the individual changes but also not see them by default when looking at the history of a file."
Fossil merges do this. More people need to use Fossil; it's got a ton of great ideas.
"If I split a file in two I still want to be able to see blame correctly for the author of the function, not one file as freshly created and the other with a bunch of deletes."
Now this is a good idea that I've never seen in a VCS.
packetlost 13 hours ago [-]
> "If I split a file in two I still want to be able to see blame correctly for the author of the function, not one file as freshly created and the other with a bunch of deletes."
>
> Now this is a good idea that I've never seen in a VCS.
There's a reason no one has done that: the VCS would have to have a semantic understanding of what it's tracking. I'm sure that's possible, but I think it would see extremely limited success. Honestly, it may have even been done for proprietary languages and VCS systems that have since faded into obscurity.
I'd settle for searching the git history for a particular regex/string and then running a blame on that.
Pet_Ant 13 hours ago [-]
1) An “easy” way to implement this would be to treat the original file as the parent to both files. You can add a new command “split” if needed to mark the new file as a fork of the existing file.
2) language sensitive version control seems like the next thing. We need like an LSP for VCSes.
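To make idea (1) concrete, here is a minimal sketch, assuming a hypothetical `split` operation that records the original file as the parent of both resulting files. The names `FileVersion`, `split`, and `blame` are made up for illustration and are not any existing VCS's API. Lines are matched by exact text, which is a deliberate simplification.

```python
from dataclasses import dataclass, field


@dataclass
class FileVersion:
    """One recorded version of a file (hypothetical model, not a real VCS type)."""
    path: str
    lines: list[str]
    committer: str
    parents: list["FileVersion"] = field(default_factory=list)


def blame(version: FileVersion) -> list[tuple[str, str]]:
    """Return (author, line) pairs, following parent links for unchanged lines."""
    inherited: dict[str, str] = {}
    for parent in version.parents:
        for author, line in blame(parent):
            # Naive: the first author seen for a given line text wins.
            inherited.setdefault(line, author)
    return [(inherited.get(line, version.committer), line) for line in version.lines]


def split(original: FileVersion, new_path: str, moved: list[str],
          committer: str) -> tuple[FileVersion, FileVersion]:
    """Hypothetical `split`: both results record the original as their parent."""
    kept = [line for line in original.lines if line not in moved]
    remaining = FileVersion(original.path, kept, committer, parents=[original])
    extracted = FileVersion(new_path, list(moved), committer, parents=[original])
    return remaining, extracted


v1 = FileVersion("util.py", ["def parse(): ..."], committer="alice")
v2 = FileVersion("util.py", ["def parse(): ...", "def render(): ..."],
                 committer="bob", parents=[v1])
remaining, extracted = split(v2, "render.py", ["def render(): ..."], committer="carol")

print(blame(extracted))
print(blame(remaining))
```

In this toy model, carol performs the split but blame on either file still attributes each line to whoever first wrote it, because both files inherit authorship through the shared parent.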
mamcx 12 hours ago [-]
The other way is to make the tool's UX carry the semantics, i.e.:
`git split`
Something that I enjoy with jujutsu is that the semantics are the tool itself. ONCE you do that, the rest becomes easier!
tlb 13 hours ago [-]
git actually does this. `git diff --find-copies`
Pet_Ant 13 hours ago [-]
If I run blame on the new file, will I see the commits made by the original writers? Will it find the same code if it was written independently? It’s not about finding copies; it’s about recording changes to a codebase as an artifact and not to files. The closest git has is limited rename support.
tlb 12 hours ago [-]
Yes, if you run `git blame -C`.
WolfeReader 2 hours ago [-]
I legit did not know this. Thanks!
uproarchat 5 hours ago [-]
Wow that interface tingles something inside my brain.
zdgeier 5 hours ago [-]
A good tingle or bad tingle? haha
uproarchat 4 hours ago [-]
real good
achandlerwhite 14 hours ago [-]
Grammar nitpick: "anyways" should almost always be "anyway"
Many things were forever until they suddenly died, but I think this is especially true for git.
I'm not saying this as a git hater, quite to the contrary. I think git is great. I also think git is an ill-fit for the majority of modern commercial software projects and there will be a breaking point where companies realize that and move on.
Banditoz 14 hours ago [-]
What is git not suited for in modern development? I haven't found any reasons.
jayd16 13 hours ago [-]
Git is great but if you really haven't found any reasons then you haven't looked at all. From large files to submodules to hook permissions and file permissions... The list of where git falls short goes on and on.
There's plenty of workarounds too, but that's what they are. Workarounds.
gchamonlive 13 hours ago [-]
Do you know if Jujutsu addresses these issues?
hn92726819 12 hours ago [-]
jj does not have large file or submodule support, but it does intend to in the future (you can read their design docs). Right now it's git compatible, so I'm not sure how 'permissions' would be stored compatibly, or what that means. I'm guessing ownership and xattrs
steveklabnik 9 hours ago [-]
With the git backend, jj inherits git's problems. So right now, it does not, at least directly.
With other backends, it inherits their problems. But also their solutions :) So with those backends, it could!
dgellow 12 hours ago [-]
Game development, with very large assets. Also, git is pretty terrible with non-text files.
driggs 11 hours ago [-]
Seconding this for geospatial dev projects, which may have absolutely massive binary data files.
Exoristos 12 hours ago [-]
You're diffing very large assets?
dgellow 9 hours ago [-]
Is that surprising? It’s pretty common in anything game dev related, that’s why perforce is still in use, despite its horrible UX
fusslo 13 hours ago [-]
1. rewriting history
2. rebase based merge strategies - our team has 50+ devs across three continents merging into monorepo with teams maintaining submodules. By the time your merge request passes CI it has to be rebased. People are literally holding off on reviewing merge requests to make sure their own changes get in first
3. permissions for subdirectories/assets. some necessary code/modules are highly regulated and company secrets. Git can't lock certain directories based on who clones the repo
4. Agentic coding - if you don't commit then your changeset after each request is lost. JJ solves this. You could just say to commit after every request then squash the commits. But, I think this is an ergonomic argument
5. Maybe it's just my experience, but git-lfs is pretty annoying to manage on large teams and changing files to/from lfs. often easier to just delete and clone again
6. git blame on non-meaningful changes. Running a code linter to add/remove whitespace makes git blame return who ran the linter rather than who wrote the code
7. self-reported identity. every time we get new laptops (because they buy the cheapest POS) devs forget what they set for 'username'. so it ends up being 3-4 different identities with the same email
Those are just my complaints lately
ef2k 12 hours ago [-]
to be fair, #2 exists because monorepos and submodules are somewhat antithetical concepts. A monorepo is supposed to be the single source of truth for the codebase, while submodules are pointers to external repos with their own history. That alone will increase the source of churn for teams that are constantly merging.
skydhash 12 hours ago [-]
1. git rebase and last commit amending.
2. That smells like the wrong code architecture. If change requests lead to unneeded code conflicts, you need to rework your code architecture.
3. That’s valid, but why not create libraries out of those modules?
4. Valid. But I think the issue is on the agent side. Git has already all the features to make those happen, it’s the agent that is not integrated with git.
5…
6. Other than sweeping changes (adding a formatter, changing config,…), there’s no need for formatting changes to be their own commit in the main repo. I usually add a check to prevent inconsistent formatting.
7. The git history has the previous username and email recorded alongside each commit.
WolfeReader 13 hours ago [-]
1. Ease of use. Other VCS have more consistent command line interfaces; Git's interface has to be studied. In practice, people end up using GUIs with missing functionality and then end up searching for help, and a lot of real experts come to rely on powerful wrappers like Magit, LazyGit, or JJ.
(Compare to Mercurial, Fossil or Git; those systems have consistent and usable interfaces. There's much less demand for wrappers or LLM tooling since they're easy to use already.)
2. Preservation of history. Two common commands - git rebase and git push -f - cause commit history to be lost, sometimes permanently. ("Just be careful" and "Just don't use those commands" are useful pieces of advice for an individual, and virtually impossible to enforce over groups.)
3. Conflict resolution. Git forces the user to resolve conflicts ASAP so we often lose information about A. What the conflict exactly was, and B. How the individual resolved it. Most VCS have this issue; JJ allows you to commit the conflict and solve it in a separate commit, which is nice.
skydhash 11 hours ago [-]
1. Usually people have no mental model about versioning other than “draft-1, draft-final, final, final2, final-final,…”. Because they don’t care about requirements and design decisions documentation, auditing changes, and release management. Git provides a set of tools for solving those. Wrappers are for when you have your own workflow for those needs and have a good understanding of git.
3. The goal of writing code is to have working software. Conflict messages are like compiler warnings, better have them than getting errors slipping by unnoticed. If A conflict with B, the root cause is often a design conflict, which means that the design of the software is inconsistent.
The conflict only matters as long as it’s not been solved. For each commit, the design of the software needs to be consistent, and the succession of commits describes the evolution of the design. A is not lost, B is not lost in the case of a merge and may stay for a long time when rebasing. C, which resolves the difference between A and B (and may replace B), is also consistent. I don’t care about inconsistency.
How’s it an ill fit? Outside of large monorepo things, which are not the majority of modern commercial software projects, the main complaint I hear is the learning curve. But LLMs should be addressing that fairly well.
PunchyHamster 11 hours ago [-]
"Majority" is a massive stretch. There are 2 main pain points:
* monorepo megarepos - but you kinda need system built from scratch that sacrifices a lot in other places to handle that in the first place
* media asset heavy repositories - again, different paradigm. Stuff that makes Git great, like full local history, just becomes impossible to do sensibly when the amount of changes per day is hundreds of megabytes.
Most projects don't hit that. And for the majority, git + LFS is enough.
hanneshdc 9 hours ago [-]
I actually like this.
I thought I wouldn’t because it’s just another git - but git worktrees are a PITA.
Can I suggest, though, focusing the readme on the lightning-fast checkout for multi-agent loads? That seems to be the big selling point and is the real win over git.
I think other commenters here are missing the point - it’s not “for agents” in that the API is somehow agent friendly. Of course git being omnipresent in the training data gives it a one-up. It’s “for agents” in that it aligns with a multi-checkout workflow better than git does.
zdgeier 8 hours ago [-]
Great point, the readme for this repo is not great right now and we have a bunch to improve on that people have pointed out. Thanks!
blurbleblurble 11 hours ago [-]
For concurrency reasons Pijul is an excellent git replacement for agents in my experience.
vova_hn2 13 hours ago [-]
I cannot imagine git being a performance bottleneck in agentic workflow.
> You can work on many tasks in parallel without needing to download everything or fight worktrees.
What does "download everything" even mean? Why would you "fight worktrees"?
zdgeier 8 hours ago [-]
"download everything" means that you don't have to do a full or partial clone of the repo to make a change. You can imagine agents running in the cloud need to spin up and access the repo for every task, this can be slow for large repos (also small repos with large files). Also locally, worktrees can be a pain to manage with conflicts, confusing branching, and you can't check out the same branch multiple times. But I do agree that we're probably still pretty early on in the agentic adoption that many users using agents in git-like ways will not see much performance improvement.
ilc 6 hours ago [-]
I could, I hit it in some situations. But I'm running some fair size repos where I do multi-worktree work.
That said, it hasn't been enough of an issue for me to want to fix.
robby1110 12 hours ago [-]
Certainly an interesting project although I am wondering what makes the benefits mentioned agent specific? You have mentioned performance improvements which is great but in that case would it not just be a better vc than git in general? what perks only work with agents that wouldn't work with individuals?
zdgeier 11 hours ago [-]
Thank you! The bet we're making is that agents will need to work on tasks much faster and with more parallelism than humans do with Git right now, so these performance metrics will matter more. Git is awesome for human work, so I'm not sure these metrics really matter that much to people. But also for agents, worktrees are not the best (can't work on the same branch in multiple places), and the speed at which branches/PRs are created, merged, etc. will need to be way faster and simpler, since keeping up-to-date with how fast things are being modified is really important.
insane_dreamer 2 hours ago [-]
anything that improves on git's branches vs worktrees mess is a GoodThing, imo
IshKebab 14 hours ago [-]
Does this try to solve the biggest problems with Git: submodules and LFS?
zdgeier 14 hours ago [-]
Planning on some monorepo features soon that should solve some submodule problems but haven't approached yet. I have some new ideas here. And yes, no separate LFS system!
IshKebab 14 hours ago [-]
> And yes, no separate LFS system!
Awesome. How does one decide which files should be stored externally, and manage that? And where is that decision stored?
zdgeier 14 hours ago [-]
I'm a little confused by this but I assume you're talking about marking files for LFS (.gitattributes)? For us, we chunk every file (even if it's a single chunk) so every file is stored in the same way -- it's just data to us. But let me know if I got your question wrong.
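As a rough illustration of "every file is chunked and stored the same way", here is a minimal sketch of content-addressed chunk storage. Everything in it is an assumption: the fixed chunk size, the SHA-256 addressing, and the names `store_file` and `load_file` are illustrative, and Oak's actual chunking scheme is not described in this thread.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose for illustration; an assumption, not Oak's value


def store_file(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Split data into fixed-size chunks, store each under its SHA-256 digest,
    and return the ordered list of digests (the file's manifest)."""
    manifest = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # identical chunks are stored only once
        manifest.append(digest)
    return manifest


def load_file(manifest: list[str], store: dict[str, bytes]) -> bytes:
    """Reassemble a file by concatenating its chunks in manifest order."""
    return b"".join(store[digest] for digest in manifest)


store: dict[str, bytes] = {}
small = store_file(b"hi", store)            # a small file is just one chunk
large = store_file(b"abcdabcdabcd", store)  # a larger file is several chunks

print(len(small), len(large), len(store))
```

The point of the sketch is that small and large files take the same code path, so nothing like a `.gitattributes` marker is needed to decide which files get special storage.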
IshKebab 11 hours ago [-]
So the problem that LFS solves is that sometimes you have large files that you a) don't want to download by default, e.g. old binary executables, b) are big enough that you want to serve them from a more efficient source e.g. S3, or c) are large and not needed forever, so you want to be able to delete them one day.
LFS "solves" those but it does it really badly. Really really badly. I've probably forgotten all the ways, but at least:
1. It conflates the content with the storage mechanism. You can't change retrospectively how the files are stored, even though the only thing you really need to be immutable is their content.
2. It requires you to actively set up git-lfs, otherwise it silently does the wrong thing.
3. Not exactly LFS's fault but I have yet to find a forge (GitHub, Gitlab etc.) which exposes the LFS stored files in a sane way. Last time I tried it was basically impossible to delete old files, and you needed a lot of extra work to even enable LFS in the place.
philipwhiuk 13 hours ago [-]
This sounds similar to Epic Games' Lore approach - have you seen what they're doing?
Yes Lore looks awesome! I was previously working on a VCS for gamedevs -- that space definitely needs something better than Perforce haha. My comment would be that I think you need to nail a new GitHub + Git to really be successful in the space and I'm not sure Epic is focused on this. I wonder if Lore will turn out similar to Unity Version Control which is very specialized to Unity workflows or something more general. Definitely love what they're doing though.
agalamli 12 hours ago [-]
seems like an interesting idea. the only friction would be to get people to use it instead of git, however i believe it will happen slowly, more people trying it and recommending it to others.
manquer 12 hours ago [-]
There are git clients for perforce, hg, svn and so on[1]. Oak's developer (or the community) could always develop a git frontend for users who prefer that.
Totally agree. This is going to be the hardest dev tool to get people to switch, but we're trying! I think there's going to be many more players in this space in the next year or so. It'll be interesting to see what shakes out.
vcryan 9 hours ago [-]
I think git is the perfect system to bring order to AI coding. Generate code however you like, but ultimately, it goes through a sensible, proven, and auditable pipeline. As someone also building AI tools, git is something I find myself "building with" - not against.
whalesalad 10 hours ago [-]
Could this not have been a git wrapper? I struggle to see why git had to be abandoned entirely. Now you need to rebuild the world just to add some ergonomics on top of git.
zdgeier 8 hours ago [-]
Could probably build a Git backend at some point for people to use kind of like how jj does it. Right now my goal with this is to see what's possible natively if we don't try to build on top of Git. But definitely agree that the wording needs to be improved here to explain what benefits we get from not building on git (better mounts, built-in LFS, clearer branching/workflows, etc.).
croes 10 hours ago [-]
Given the promise that AI doesn’t need special treatment to handle our data, we change a lot to please the needs of AI.
Like replacing streets with rails while we claim FSD works out of the box
teaearlgraycold 10 hours ago [-]
You're jerking yourself off with the web UI here. What I see on first load is essentially completely homogeneous, just a sea of black boxes with white text. As a purely aesthetic composition it's interesting to look at from afar. But in the context of presenting it to strangers it's hard to approach. Too much going on and hard to parse.
mrmrs 9 hours ago [-]
Not sure about the accuracy of the first sentence but agree we have a lot of work to do on the UI and narrative front! Most of my focus as a designer has been on the CLI itself. I use oak to develop oak and want the experience for agents and humans to be as smooth and wonderful as possible. The landing page is not up to our own standards yet but we are iterating every day to try and improve it. Hope to make the content more approachable for you and others soon. Really appreciate the feedback.
jazz9k 12 hours ago [-]
It's kind of like replacing Wordpress. Sure, you can make a better alternative. But replacing an entrenched player that has been there for at least a decade will be almost impossible.
dgellow 12 hours ago [-]
you don't need to replace, you just need to find your niche, then expand from there
nixosbestos 13 hours ago [-]
Yeah, I'll just wait for jj to get more virtualized FS features, and be very, very happy with that.
zdgeier 11 hours ago [-]
I love jj! Martin is really awesome and love his work. I know they're using a VFS backing inside Google for their monorepo, wonder if we'll get some of those cool features on the outside.
steveklabnik 9 hours ago [-]
There is a Google intern project to work on an open source vfs thing, we'll see how it goes.
GroksBarnacles 12 hours ago [-]
I wish we as a society would stop using random words for products. "Slacked about Oak, but they need in Fizzle. The deck's in Slate, the assets are in Vault, the timeline's in Pulse, the copy's in Quill, the build's in Forge, and the launch party's already in Ember."
jayknight 10 hours ago [-]
Aren't all product names "random" words before they become well known? Some are more descriptive than others, but they're all arbitrary to begin with.
gitpusher 11 hours ago [-]
What alternative would you suggest? "Sent a message in our enterprise chat application about the version control system that our engineering team sometimes uses (no, not that one, the other one) but they need it in the project management app that our Design team uses (good luck requesting access from IT, we have more than a dozen project management tools in our catalog)"
ianburrell 10 hours ago [-]
The solution is to always add a second name. Like people, projects could have short names and full names. It could be the organization (Apache Spark), the type (Oak Versioncontrol), or a longer name (Violet Spring). Then you can use the short name internally and the full name globally. I like the type name because it tells you what it is.
russellthehippo 8 hours ago [-]
"Welcome to the agentic substrate"
YAAS (yet another agentic substrate)
jing09928 4 hours ago [-]
[flagged]
sarracin0 8 hours ago [-]
[flagged]
Nicolau_Amorim 9 hours ago [-]
[flagged]
zane_shu 8 hours ago [-]
[flagged]
codymisc 13 hours ago [-]
[flagged]
noelwelsh 14 hours ago [-]
A few comments:
* The core idea sounds interesting. Make it the first paragraph, not paragraph seven.
* Spend more words describing what makes Oak different.
* "I built a version control system in my free-time called Jam". You probably didn't name your free time. "I built a version control system, called Jam, in my free time."
philipwhiuk 13 hours ago [-]
"I built a version control system, in my free-time, called Jam" is fine.
AdamN 13 hours ago [-]
Just "I built a version control system called Jam". The free-time thing is good for a history page but the homepage needs to tell the important part (you've got history and expertise in this subject) and then move onto what the vision is for Oak and what kind of help you need.
stonogo 13 hours ago [-]
It's also fine without the commas, because nobody was confused by that structure.
sublinear 14 hours ago [-]
Lots of self-promotion, but no concrete comparisons where this tool does a better job than git.
The only thing to go on is this single sentence: "With virtual mounts, agents locally and in the cloud no longer need a full copy of a repo to get working."
> For the first 100 users that subscribe to a paid plan I will send you a personalized e-ink display
I don't understand anyone who feels incentivized by this. Brogrammer 2.0 is weird.
zdgeier 14 hours ago [-]
Check out the homepage! https://oak.space might have what you're looking for. I can answer any questions you have here as well.
dang 12 hours ago [-]
Since this project hasn't appeared on HN before and is obviously of interest to people, I've taken the liberty of turning your post into a Show HN (which is the convention for sharing your work on Hacker News - https://news.ycombinator.com/showhn.html). I used the text from your blog post that explains what the project actually is - this is the bit we need to lead with, to stanch all the "I can't tell what this is" comments.
I hope this is ok! If you prefer different text at the top, let us know at hn@ycombinator.com.
manwithopinions 14 hours ago [-]
The blog post is a terrible intro, the website is much more insightful: https://oak.space/
I found the section titled “Local feature branches. Server main. One squash.” most interesting.
There are work arounds though and I am creating what I call knowledge triggers for Pi that are similar to claude's "PreToolUse" so having the agent use oak all the time is not an issue in my opinion.
The challenge for oak is why? Considering how I actually want to slow agents down so I can ensure they are doing the right thing, and because the massive bottleneck is the LLMs themselves, speed measured in milliseconds or even seconds will not concern many.
I thought oak was more of, we know how to prompt inject context based on code that is stored in oak for example, but faster operations can help, but the use case is limited. The missing piece for better/correct code is context at the right time.
There's a limit to how many simultaneous instructions an agent can follow (the exact number depends on the specific model, so instructions that are fine for one model may overwhelm another). If this keeps happening, consider trimming your instructions or, even better, solving it at the harness level (like intercepting and rewriting ripgrep calls to use your thing, like rtk [0] does in agents that support this).
Overall, never leave to an agent an instruction that must be followed at all times. For example, doing things in a git hook beats a multi-command workflow that the agent must run every time it commits, etc.
Is this state of things forever? I don't think so. Very soon models will become so much better that this will be a non-problem.
[0] https://github.com/rtk-ai/rtk
Another option: when model invokes standard tool, rewrite the invocation to newfangled tool.
Bunch of ways of doing it:
(a) Invocation of standard tool returns error saying to use newfangled tool instead
(b) Invocation of standard tool returns message saying it has been dynamically rewritten to invoke newfangled tool, followed by newfangled tool output
(c) Invocation of standard tool in context is dynamically rewritten to invocation of newfangled tool, prior to execution
In case (c), the model ends up thinking it somehow knew about this new thing all along, even though it actually didn’t
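Options (a) and (c) above could be sketched roughly like this, assuming the harness sees each shell command as a string before running it. `newvcs` is a placeholder name rather than a real CLI, and `REWRITES`, `rewrite_invocation`, and `reject_invocation` are hypothetical helpers, not part of any agent framework.

```python
import shlex
from typing import Optional

# Hypothetical mapping from a familiar command prefix to a replacement tool.
# "newvcs" is a placeholder, not a real command.
REWRITES = {
    ("git", "status"): ["newvcs", "status"],
    ("git", "diff"): ["newvcs", "diff"],
}


def rewrite_invocation(command: str) -> list[str]:
    """Option (c): rewrite the call before execution; other commands pass through."""
    argv = shlex.split(command)
    replacement = REWRITES.get(tuple(argv[:2]))
    if replacement is None:
        return argv
    return replacement + argv[2:]  # keep the model's original arguments


def reject_invocation(command: str) -> Optional[str]:
    """Option (a): return an error telling the model what to run instead,
    or None when the command is allowed as-is."""
    argv = shlex.split(command)
    replacement = REWRITES.get(tuple(argv[:2]))
    if replacement is None:
        return None
    return f"error: use `{shlex.join(replacement + argv[2:])}` instead"


print(rewrite_invocation("git diff --stat"))
print(reject_invocation("git status"))
```

Option (b) would be the same rewrite as (c), with a notice prepended to the tool output so the model can see the substitution happened.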
Regardless, in any of these cases, the implementation for any of these above options may be vastly superior to the “naive” implementation for agents — but then the parent comment here is right that an engineer would need to justify their implementation to users, not just make a loud conjecture. It’s a non-trivial claim to say that a bespoke solution not present in tool-use training and accounting for context-rot would result in a better performing model. Moreover, justifying an agent-specific efficiency gain that humans wouldn’t benefit from makes the claim even more non-trivial. Using Sagan’s razor, it’s then reasonable for people to ask for a comparably non-trivial amount of evidence.
The thing we’re trying to optimize is not whether an agent can remember the command. It’s the runtime shape of agent-driven development.
When an agent drives a VCS through a captured terminal, things that are tolerable for humans become direct costs: clone/setup time, worktree setup, full status output, huge diffs, branch cleanup, interactive prompts, shared-checkout mutation, repeated preflight checks. Those costs show up as wall time, bytes over the wire, transcript tokens, and recovery steps.
So the Oak bet is narrower than “agents can’t use git.” They can. The bet is that if you assume branch-per-agent workflows, lots of parallel sandboxes, large repos, and non-interactive command execution, the VCS interface should have different defaults if you want to optimize for shipping speed and efficiency of token usage. If you're already going fast enough and not running out of tokens - then using oak seems pretty silly.
People do not need to ditch git to try Oak out. One workflow we care about is letting agents work in Oak where the agent-specific costs matter, then exporting back to git for the human review, CI, release, or compliance workflows.
Totally agree this should be provable and benchmarked. The homepage has Oak vs Git numbers because we do not want “for agents” to just be vibes. We’re measuring transcript bytes, estimated tokens, tool calls, wall time, large diff/status behavior, and contention in agent-style workflows. We’re also working on the benchmarks repo in the open: https://oak.space/oak/benchmarks
The exciting part to me is that we can already improve on tokens and timing despite starting with the model-prior deficit you’re describing. If we can win on measured agent workflows while git still has the advantage of being deeply baked into the models, I’m incredibly bullish on where Oak can get to as the tool and the ecosystem matures.
Longer term, if Oak proves useful and sticks around, future frontier models will likely have more Oak examples in training data, which lowers the upfront learning tax for an extra boost.
The tool is called gitnow. it is honestly quite simple, just create a project, add the repositories you want and get to building. I've found having another claude chat or whatever use the tool to great success coupled with zellij, but could also be zed, tmux or whatever.
Secondly it also pretty much solves the problem of the agent dumping memory files everywhere, they now basically have a scratch space that is theirs, where they can keep their tasks, and just update the repositories as needed.
Use gn the shell after eval if you use it, it will actually invoke cd, instead of creating a subshell.
https://github.com/kjuulh/gitnow
There’s some mention about performance, which is great, but the performance of git isn’t a bottleneck for agents.
There’s some mention about token use being reduced, which is great, but how have they achieved that vs git’s porcelain modes? And why does token count require a whole new VCS, and thus incompatibilities with all the established git ecosystems?
I really want to find reasons to like this but it’s probably some of the worst product marketing I’ve seen. And something this significant really does need to sell itself hard if you’re going to get enough people in a project team to agree to switch away from git
But why? Why would I want to like a project which seems to invent problems rather than solve any? I don’t want to like this.
Edit: I see people bringing up lazy file checkouts in conjunction with mounting a branch. For some of the enormous repos people work in this makes sense to me.
Would this be like `git commit --allow-empty-message`?
Eh, it depends on the workflow. Especially if you have certain stack-based workflows. Worktrees are kind of a half solution here, but depending on the repo type and whether you are dealing with LFS or sparse checkouts, I've had agents struggle really hard to work through a stack or rebase things without a lot of thrashing or being IO bound by just stumbling into operations in a boneheaded way. Now I have AGENTS.md/skills/hooks guardrails littered about to try and work around things.
I know git (the VCS) can become a bottleneck with massive monorepos at the scale of Linux or Microsoft. But is anyone likely to port them to something new just to be a little more agent friendly? And if the goal of this new VCS was to make life easier for large monorepos (for humans as well as agents) then why doesn’t the author mention that on the project’s website? Because that’s exactly the kind of thing that might make this an easier sell to project teams.
I would normally assume there's 0 percent chance that `git` (the binary) is a significant impact on LLM based devel. The same applies to git, the protocol/format/tree.
I'd love to hear about what makes the workflow you have one where any part of git becomes a noticeable proportion of the process? Unless you mean your LLM just can't figure out how to use git?
Git sparse checkout is helpful but checking files out as they are needed is much more flexible and intuitive.
Microsoft VFS for Git / GVFS is the closest that I can think of.
There is room for this lazy mount idea to be built on top of Git
But of course at Google the file system part (CitC) is a layer beneath the version control system and is shared across different vcs tools.
Show HN: Open-source version control for game developers - https://news.ycombinator.com/item?id=36485377 - June 2023 (0 comments)
(Hopefully we're making up for it with this one)
Edit: ah here we go: Show HN: A version control system based on rsync - https://news.ycombinator.com/item?id=34439461 - Jan 2023 (118 comments)
Partially why I got so excited about version control is how well this post blew up when I posted.
Human decision-making, communications and awareness are the key bottlenecks, not code generation and commit speed, by several orders of magnitude.
And I think that's a good thing if we want to avoid mass-psychosis.
Since it's early, here are a couple of things I'd loooove git to be and it's not. Maybe you can consider going in this direction and, if there are many more like me, get a large user base:
- The private/public quantum shouldn't be a repo but something more fluid within a repo. A public repo should be able to have private sub-directories, files, etc. It should be fluid in this regard, so big projects could open-source some features, not all. Right now it's all or nothing, and that closes the doors to many big closed projects.
- Env variables. If you could make their usage easier and more seamless within oak, that could convince many (me included). It's really a headache to deal with env vars and git, and that shouldn't be the case.
- Collaboration for agents beyond PRs. I don't know exactly what the flavor for this is, but I know that fundamentally the create-PR/merge cycle of git is not how it should be.
Great initiative and good luck!
[1] https://fossil-scm.org
You kinda lost me there. I‘m supposed to use a central technology whose author can’t be arsed to write a few paragraphs?
[0]: https://i.sstatic.net/4vbd9.png
It's absolutely great for keeping a bunch of exploratory changes alive, quick prototyping, etc., as I tend to do with basically every source tree I have on my machine. I don't have to think at all about the stuff I hate about git (babying the index, being careful to get amends right the first time because undos are annoying, etc.)
Does not support LFS or submodules though.
[0]: https://github.com/LoganDark/get-shit-done/tree/jj-vcs
We still use submodules in a number of places at work so it’s a bit of friction for me. Other than that, I’m rapidly becoming a jj convert.
Relatedly, when I use filesystem paths as remotes they need to be colocated or else it doesn't work, which is a little annoying!
https://zed.dev/deltadb
Edit: this was actually announced in a very recent blog post (11 July 2026, so just 11 days ago): https://zed.dev/blog/introducing-deltadb
The blog post has some more relevant information as well.
My initial reaction is to ask whether this is something that could be built on top of Git, rather than replacing it. Describe the data model: what is a "commit", what is a "branch", ... If they are the same as in git, then why not reuse it?
(The submitted title was "Git is forever. I'm building Oak anyways." and the submitted URL was https://oak.space/blog.)
The problem space and solution have been around for a while in big tech, and now there are a handful of publicly known products, and probably a couple dozen still-secret ones. It's just that now, with AI/agent volume, there probably needs to be an easier solution for quick, narrowly focused VCS views.
For filesystem mounts (usually FUSE) of a version control system, which let multiple viewers work without transferring a lot of data, see some current and previous implementations:
- Google: Piper via CitC (Clients in the Cloud) often used with Cider (web IDE)
- Meta: Sapling on EdenFS (from what I read, never worked there)
- Rational ClearCase: anyone else remember mounting VOBs?
The main issue I see is with the site: it just seems like a big blob of AI-generated text that I need to wade through to understand what is going on. The cool part wasn't even shown off: your GitHub UI clone that you can get to from seeing the benchmark code.
FYI, I also think the 4-way arrows logo has been used before, and still might be in use. I tried searching, but I think I saw a multi-colored one, maybe in a UK-based IT corporate training company's class I attended.
This looks interesting regardless, but I do wonder if the latency focus is the wrong way to sell this.
Now, every brain fart is published as a finished product no one wanted.
[1] https://oak.space/oak/benchmarks
Setup: a Linux box on the other side of Romania (compared to where I am living) reading from a Windows box in Singapore (~200 ms RTT)
- reading 1 MiB of a 1 GB remote file pulls only 16 MiB (~98% avoided); this is because of my fine-tuning optimization choice
- first byte: approx. 2.3 s
- git-LFS repos also clone cold over the mount byte-perfect (separate Mac-to-Linux run on a ~20 ms RTT)
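For anyone who wants to sanity-check the "~98% avoided" figure, here is a quick sketch of the arithmetic. It assumes "1 GB" means 1 GiB (1024 MiB); the function names are made up for illustration and are not part of KeibiDrop.

```python
MIB = 1024 * 1024  # bytes in one MiB


def avoided_fraction(file_bytes, transferred_bytes):
    """Fraction of the remote file that never had to cross the network."""
    if file_bytes <= 0:
        raise ValueError("file_bytes must be positive")
    return 1.0 - transferred_bytes / file_bytes


def read_amplification(requested_bytes, transferred_bytes):
    """How many bytes were pulled per byte the reader actually asked for."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    return transferred_bytes / requested_bytes


file_size = 1024 * MIB   # assumption: "1 GB" read as 1 GiB
requested = 1 * MIB      # the 1 MiB read from the comment
pulled = 16 * MIB        # what actually went over the wire

print(f"avoided: {avoided_fraction(file_size, pulled):.1%}")          # avoided: 98.4%
print(f"amplification: {read_amplification(requested, pulled):.0f}x")  # amplification: 16x
```

So the read is amplified 16x relative to the request, but that is still only about 1.6% of the whole file. Reading "1 GB" as 10^9 bytes instead changes the result only slightly; it still rounds to ~98%.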
The thing that I do differently is that my metadata is eagerly pushed, as I optimized for content streaming.
I have not tested 100k-file tree mounts yet.
But my goal was to have instant file access for generic files between apps, peer to peer, and it also supports Windows :D
Here is the tool: https://github.com/KeibiSoft/KeibiDrop
How does your agent run tests or click around the UI to verify changes if it doesn't have the full code?
Another thing: inside these mounts, build artifacts and directories like node_modules can act kinda weird, so we just have some extra context in the AGENTS.md to host these in a different location from the mount. Or agents usually figure this out on their own, in my experience.
[1] https://github.com/oakdotspace/oak/releases
From the Hacker News guidelines: https://news.ycombinator.com/newsguidelines.html
> This repo was written almost entirely using AI with human oversight. If you see anything that needs fixed or would like to contribute, please email ... or reach out on Discord
Why not just provide an email address that's delivered directly to the agents you have developing Oak?
I didn't delve into the benchmark repo to understand what your loop is measuring. Why would an agent (without fine tuning or oak-specific context) be faster with oak than it is with git or jj?
A large part comes from mounts: being able to use FSKit/FUSE to make a change to a repo rather than doing a partial/full clone. A smaller part comes from having optimized context (JSON output) that agents are able to parse better with fewer tokens.
If I split a file in two I still want to be able to see blame correctly for the author of the function, not one file as freshly created and the other with a bunch of deletes. I wish commits could be folded into larger commits so that you can still capture the individual changes but also not see them by default when looking at the history of a file.
Just a more human centric perspective on change history where it captures the way we talk and think about changes.
Fossil merges do this. More people need to use Fossil; it's got a ton of great ideas.
"If I split a file in two I still want to be able to see blame correctly for the author of the function, not one file as freshly created and the other with a bunch of deletes."
Now this is a good idea that I've never seen in a VCS.
There's a reason no one has done that: the VCS would have to have a semantic understanding of what it's tracking. I'm sure that's possible, but I think it would see extremely limited success. Honestly, it may have even been done for proprietary languages and version control systems that have since faded into obscurity.
I'd settle for searching the git history for a particular regex/string and then running a blame on that.
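Stock git gets part of the way there. A rough sketch, assuming plain `git` on PATH: `git log -G<regex>` (or `-S<string>`) finds commits whose diffs touch the pattern, and `git blame -w -C` ignores whitespace-only changes and looks for lines moved or copied from other files changed in the same commit. That is a textual heuristic, not semantic tracking, so a split plus a rewrite can still fool it. The helper names are made up for illustration.

```python
import subprocess


def history_search_cmd(pattern, regex=True, path=None):
    """Build a `git log` pickaxe command.

    -G<regex>  : commits whose diff has added/removed lines matching the regex
    -S<string> : commits that change how many times the string occurs
    """
    flag = "-G" if regex else "-S"
    cmd = ["git", "log", flag + pattern, "--format=%H %an %s"]
    if path is not None:
        cmd += ["--", path]
    return cmd


def blame_cmd(rev, path, start, end):
    """Build a `git blame` command for lines start..end of path as of rev.

    -w ignores whitespace changes; -C also looks for lines moved or copied
    from other files modified in the same commit.
    """
    return ["git", "blame", "-w", "-C", "-L", f"{start},{end}", rev, "--", path]


def run(cmd, repo="."):
    """Run a git command inside repo and return its stdout as text."""
    result = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return result.stdout
```

Typical use would be `run(history_search_cmd(r"def\s+parse_"))` to find the candidate commits, then `run(blame_cmd(sha, "src/parser.py", 10, 40))` on one of the hashes it prints (the path and line range here are placeholders).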
2) language sensitive version control seems like the next thing. We need like an LSP for VCSes.
`git split`
Something that I enjoy with jujutsu is that the semantics is the tool itself. ONCE you do that, the rest become easier!
Many things were forever until they suddenly died, but I think this is especially true for git.
I'm not saying this as a git hater, quite the contrary. I think git is great. I also think git is a poor fit for the majority of modern commercial software projects, and there will be a breaking point where companies realize that and move on.
There's plenty of workarounds too, but that's what they are. Workarounds.
With other backends, it inherits their problems. But also their solutions :) So with those backends, it could!
2. rebase-based merge strategies - our team has 50+ devs across three continents merging into a monorepo, with teams maintaining submodules. By the time your merge request passes CI it has to be rebased. People are literally holding off on reviewing merge requests to make sure their own changes get in first
3. permissions for subdirectories/assets. Some necessary code/modules are highly regulated and company secrets. Git can't lock certain directories based on who clones the repo
4. Agentic coding - if you don't commit then your changeset after each request is lost. JJ solves this. You could just say to commit after every request and then squash the commits. But I think this is an ergonomic argument
5. Maybe it's just my experience, but git-lfs is pretty annoying to manage on large teams, as is moving files to/from lfs. It's often easier to just delete and clone again
6. git blame on non-meaningful changes. Running a code linter to add/remove whitespace makes git blame return who ran the linter rather than who wrote the code
7. self-reported identity. Every time we get new laptops (because they buy the cheapest POS) devs forget what they set for 'username', so it ends up being 3-4 different identities with the same email
Those are just my complaints lately
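On complaint 6, git does have a partial answer: `git blame --ignore-revs-file <file>` (or the `blame.ignoreRevsFile` config key) skips listed commits when assigning blame. A minimal sketch of wiring that up follows; the file name `.git-blame-ignore-revs` is only a common convention, and the helper names are hypothetical.

```python
IGNORE_FILE = ".git-blame-ignore-revs"  # conventional name, assumed here


def render_ignore_file(revs):
    """Render the ignore file from a {full_commit_hash: reason} mapping.

    The format is one unabbreviated hash per line; lines starting with '#'
    are comments.
    """
    lines = []
    for sha, reason in revs.items():
        if len(sha) != 40:
            raise ValueError("use full 40-character commit hashes")
        lines.append(f"# {reason}")
        lines.append(sha)
    return "\n".join(lines) + "\n"


def blame_cmd(path, ignore_file=IGNORE_FILE):
    """One-off blame that skips the commits listed in ignore_file."""
    return ["git", "blame", "--ignore-revs-file", ignore_file, "--", path]


def config_cmd(ignore_file=IGNORE_FILE):
    """Make every `git blame` in this clone use ignore_file by default."""
    return ["git", "config", "blame.ignoreRevsFile", ignore_file]
```

Each clone still has to run the config command once (or pass the flag), so this is a workaround rather than a fix. For whitespace-only noise specifically, plain `git blame -w` may already be enough.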
2. That smells like the wrong code architecture. If a change request leads to unneeded code conflicts, you need to rework your code architecture.
3. That's valid, but why not create libraries out of those modules?
4. Valid. But I think the issue is on the agent side. Git already has all the features to make those happen; it's the agent that is not integrated with git.
5…
6. Other than sweeping changes (adding a formatter, changing config, …), there's no need for formatting changes to be their own commit in the main repo. I usually add a check to prevent inconsistent formatting.
7. The git history has the previous username and email recorded alongside each commit.
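To make point 4 concrete, this is roughly what "commit after every request, then squash" looks like with stock git. It is only a sketch of how an agent harness might drive it; the function names and the `checkpoint:` message prefix are invented for illustration.

```python
def checkpoint_cmds(label):
    """Commands to snapshot the whole working tree after one agent request."""
    return [
        ["git", "add", "-A"],
        # --allow-empty keeps the checkpoint sequence intact even when a
        # request changed nothing.
        ["git", "commit", "--allow-empty", "-m", f"checkpoint: {label}"],
    ]


def squash_cmds(base, message):
    """Commands to fold every commit after `base` into a single commit.

    `git reset --soft` moves HEAD back to base but leaves the index and the
    working tree as they were, so the next commit records the final state.
    """
    return [
        ["git", "reset", "--soft", base],
        ["git", "commit", "-m", message],
    ]
```

The individual checkpoints disappear from the branch after the squash (they stay reachable through the reflog for a while), which is the ergonomic gap the JJ comparison is pointing at.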
(Compare to Mercurial, Fossil or Git; those systems have consistent and usable interfaces. There's much less demand for wrappers or LLM tooling since they're easy to use already.)
2. Preservation of history. Two common commands - git rebase and git push -f - cause commit history to be lost, sometimes permanently. ("Just be careful" and "Just don't use those commands" are useful pieces of advice for an individual, and virtually impossible to enforce over groups.)
3. Conflict resolution. Git forces the user to resolve conflicts ASAP so we often lose information about A. What the conflict exactly was, and B. How the individual resolved it. Most VCS have this issue; JJ allows you to commit the conflict and solve it in a separate commit, which is nice.
2. https://git-scm.com/docs/git-reflog
It's very hard to lose data in git.
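As a sketch of what recovery through the reflog looks like: the snippet below parses `git reflog` output and finds the state HEAD was in just before the most recent rebase. The sample text is invented, and the `rebase (start)` / `rebase (finish)` message wording is what recent git versions write (older ones may differ), so treat the prefix match as an assumption. It also assumes undecorated output, as you get when the command is piped.

```python
import re

# One reflog line: "<abbrev sha> HEAD@{n}: <message>"
LINE = re.compile(r"^(?P<sha>[0-9a-f]+) (?P<sel>HEAD@\{\d+\}): (?P<msg>.*)$")


def parse_reflog(text):
    """Turn `git reflog` output into (sha, selector, message) tuples."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            entries.append((match["sha"], match["sel"], match["msg"]))
    return entries


def state_before_rebase(entries):
    """Return the newest entry that is older than the most recent rebase.

    The reflog lists newest entries first, so we skip past the run of
    rebase entries and return the first non-rebase entry after it.
    """
    seen_rebase = False
    for sha, sel, msg in entries:
        if msg.startswith("rebase"):
            seen_rebase = True
        elif seen_rebase:
            return sha, sel, msg
    return None


SAMPLE = """\
1a2b3c4 HEAD@{0}: rebase (finish): returning to refs/heads/feature
1a2b3c4 HEAD@{1}: rebase (pick): add parser
9f8e7d6 HEAD@{2}: rebase (start): checkout main
5d6e7f8 HEAD@{3}: commit: add parser
"""

print(state_before_rebase(parse_reflog(SAMPLE)))
# ('5d6e7f8', 'HEAD@{3}', 'commit: add parser')
```

From there, `git branch rescue 5d6e7f8` brings the old history back under a name. Two caveats: the reflog is local to the clone where the rebase happened, and entries expire (by default after 90 days, or 30 days for commits no longer reachable from any ref), so it does not cover a history that only ever lived on a remote that got force-pushed over.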
3. The goal of writing code is to have working software. Conflict messages are like compiler warnings: better to have them than to have errors slip by unnoticed. If A conflicts with B, the root cause is often a design conflict, which means that the design of the software is inconsistent.
The conflict only matters as long as it hasn't been solved. For each commit, the design of the software needs to be consistent, and the succession of commits describes the evolution of the design. A is not lost; B is not lost in the case of a merge and may stay for a long time when rebasing. C, which resolves the difference between A and B (and may replace B), is also consistent. I don't care about inconsistency.
* monorepo megarepos - but you kinda need a system built from scratch that sacrifices a lot in other places to handle that in the first place
* media-asset-heavy repositories - again, a different paradigm. Stuff that makes Git great, like full local history, just becomes impossible to do sensibly when the amount of changes per day is hundreds of megabytes.
Most projects don't hit that. And for the majority, git + LFS is enough.
I thought I wouldn’t because it’s just another git - but git worktrees are a PITA.
Can I suggest, though, focusing the readme on the lightning-fast checkout for multi-agent loads? That seems to be the big selling point and is the real win over git.
I think other commenters here are missing the point - it’s not “for agents” in that the API is somehow agent friendly. Of course git being omnipresent in the training data gives it a one-up. It’s “for agents” in that it aligns with a multi-checkout workflow better than git does.
> You can work on many tasks in parallel without needing to download everything or fight worktrees.
What does "download everything" even mean? Why would you "fight worktrees"?
That said, it hasn't been enough of an issue for me to want to fix.
Awesome. How does one decide which files should be stored externally, and manage that? And where is that decision stored?
LFS "solves" those but it does it really badly. Really really badly. I've probably forgotten all the ways, but at least:
1. It conflates the content with the storage mechanism. You can't change retrospectively how the files are stored, even though the only thing you really need to be immutable is their content.
2. It requires you to actively set up git-lfs, otherwise it silently does the wrong thing.
3. Not exactly LFS's fault, but I have yet to find a forge (GitHub, GitLab, etc.) which exposes the LFS-stored files in a sane way. Last time I tried, it was basically impossible to delete old files, and you needed a lot of extra work to even enable LFS in the first place.
https://epicgames.github.io/lore/explanation/system-design/ if not
[1] https://git-scm.com/book/en/v2/Git-and-Other-Systems-Git-as-...
Like replacing streets with rails while we claim FSD works out of the box
YAAS (yet another agentic substrate)
* The core idea sounds interesting. Make it the first paragraph, not paragraph seven.
* Spend more words describing what makes Oak different.
* "I built a version control system in my free-time called Jam". You probably didn't name your free time. "I built a version control system, called Jam, in my free time."
The only thing to go on is this single sentence: "With virtual mounts, agents locally and in the cloud no longer need a full copy of a repo to get working."
> For the first 100 users that subscribe to a paid plan I will send you a personalized e-ink display
I don't understand anyone who feels incentivized by this. Brogrammer 2.0 is weird.
I hope this is ok! If you prefer different text at the top, let us know at hn@ycombinator.com.
I found the section titled “Local feature branches. Server main. One squash.” most interesting.