What a great looking tool.
Amazingly generous that it’s open source. Let’s hope the author can keep building it, but if they need to fund their existence there is precedent - lots of folks pay for Superwhisper. People pay for quality software.
In a past tech cycle Apple might’ve hired the author, acquired the IP and lovingly stewarded the work into a bundled OS app. Not something to hope for lately. So just going to hope the app lives for years to come and keeps improving the whole way.
Yep, I think whoever builds the GUI for LLMs will be the next Jobs/Gates/Musk and a Nobel Prize winner (I think it’ll solve alignment by putting millions of eyes on the internals of LLMs), because computers only became popular once the GUI-based OS appeared. I just started an Ask HN for people (myself included) to share their AI safety ideas, both crazy and not: https://news.ycombinator.com/item?id=43332593
What you describe seems to be OpenAI’s “moat”. They are currently the farthest ahead in app UX for nontechnical users and in brand recognition. It doesn’t matter if they are 10% behind Anthropic in frontier model quality if Claude for Mac is a shitty Electron app.
I miscommunicated: I meant new 3D, game-like UIs. There could be a whole new OS full of apps that represent multimodal LLMs in human-familiar ways. All of today’s UIs are what I consider command-line-like: they’re like a strict librarian who only spits out quotes; no one lets you truly enter the library. We need better 3D and even “4D” long-exposure-photo-like UIs.
Claude for Mac works quite well for me, and with these MCP servers it's now looking better than ever. And regarding Electron: I have seen (and created for myself) awesome apps that would never have existed without it.
What, in your mind, is the main advantage of using the app over the website?
The only thing missing from the MCP component of Claude Desktop is a better interface for discovering, enabling and configuring different MCP servers.
The ideal UX isn’t a secret: audio with AR for context
I’m bullish on an AirPods-with-cameras experience
I’ve read the same thing about cryptocurrencies for so long (it needs a proper GUI to take off)
I miscommunicated: I meant new 3D, game-like UIs. There could be a whole new OS full of apps that represent multimodal LLMs in human-familiar ways. All of today’s UIs are what I consider command-line-like: they’re like a strict librarian who only spits out quotes; no one lets you truly enter the library. We need better 3D and even “4D” long-exposure-photo-like UIs.
I got what you mean. People have said cryptocurrencies are one UX revolution away from mainstream adoption since their inception. The reality was, and is, that it’s a solution in search of a problem.
Who said that, and how does it relate to what I wrote? You’re majorly straw-manning what I proposed.
Looks great, kudos for making it open-source! Yet as with any app that has access to my local file system, what instantly comes to mind is "narrow permissions" / the principle of least privilege.
It'd be great if the app would only have read access to my files, not full disk permission.
As an end-user, I'm highly concerned that files might get deleted or data shared via the internet.
So ideally, Sidekick would have only "read" permissions and no internet access. (This really applies to any app with full-disk read access.)
Also: why does it say Apple Silicon is required? I can run Llama.cpp and Ollama on my Intel Mac.
> It'd be great if the app would only have read access to my files, not full disk permission.
I'm running it right now and macOS didn't ask for any permissions at all which afaik means that it cannot access most of my personal folders and definitely not the full disk. Am I missing something?
> I'm running it right now and macOS didn't ask for any permissions at all which afaik means that it cannot access most of my personal folders and definitely not the full disk. Am I missing something?
This is only true if it uses App Sandbox, which is mandatory for apps distributed through the App Store, but not necessarily for everything else.
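If you want to check this yourself rather than infer it from the absence of prompts, the Security framework can report whether a process carries the App Sandbox entitlement. A minimal Swift sketch, purely illustrative and not code from Sidekick:

```swift
import Foundation
import Security

// Ask the Security framework whether this process was launched with the
// App Sandbox entitlement. The "no permission prompts" observation only
// implies containment when this returns true.
func isSandboxed() -> Bool {
    guard let task = SecTaskCreateFromSelf(nil) else { return false }
    let value = SecTaskCopyValueForEntitlement(
        task, "com.apple.security.app-sandbox" as CFString, nil
    )
    return (value as? Bool) == true
}

print(isSandboxed() ? "App Sandbox is on" : "Not sandboxed")
```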
Comments like this are what turn me off about this website. Entitled, much?
Apart from the M1 comment, I think the commenter voices a reasonable concern and isn't hostile or antagonistic
He complimented the app and made a decent suggestion.
Neat. It would be nice to provide an option to use an API endpoint without downloading an additional local model. I have several models downloaded via ollama and would prefer to use them without additional space being taken up by the default model.
From the README:
> Optionally, offload generation to speed up generation while extending the battery life of your MacBook.
The screenshot shows an example; it mentions OpenAI and gpt-4o.
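If the remote option works the way most of these apps do, it is just an OpenAI-compatible chat-completions request, which also means any local server speaking that API (Ollama, llama.cpp's server) can stand in for OpenAI. A rough Swift sketch under that assumption; the URL, key and model name are placeholders, not Sidekick's actual code:

```swift
import Foundation

// Minimal OpenAI-compatible /v1/chat/completions call. Point the URL at
// api.openai.com or at a local server exposing the same API shape.
struct ChatRequest: Codable {
    struct Message: Codable { let role: String; let content: String }
    let model: String
    let messages: [Message]
}

func complete(_ prompt: String) async throws -> String {
    var req = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    req.httpMethod = "POST"
    req.setValue("application/json", forHTTPHeaderField: "Content-Type")
    req.setValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    req.httpBody = try JSONEncoder().encode(
        ChatRequest(model: "gpt-4o", messages: [.init(role: "user", content: prompt)])
    )
    let (data, _) = try await URLSession.shared.data(for: req)
    // Full response decoding elided; the reply text sits at choices[0].message.content.
    return String(decoding: data, as: UTF8.self)
}
```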
Looks super neat!
Somewhat related: one issue I have with projects like these is that everyone seems to be bundling the UX/app with the core ... pardon my ignorance, "LLM app interface". E.g. we have a lot of abstractions for LLMs themselves, such as Llama.cpp, but it feels like we lack abstractions for things like what Claude Code does, or this RAG implementation, or whatever.
I.e. these days it seems like a lot of the magic in a quality implementation is built on top of a good LLM: a secondary layer which is just as important as the LLM itself. The prompt engineering, etc.
Are there any attempts to generalize this? Is it even possible? I feel like I keep seeing good ideas that get locked behind an app wall with no way to swap them out. We've got tons of options for abstracting the LLMs themselves, but I've not seen anything that tackles this (though I've also not been looking).
Does it exist? Does this area have a name?
On macOS, look at things like Msty.app (and of course LM Studio)?
They are pluggable across more than just LLM itself.
I went with Msty because I didn't want to run Docker, and it's been rock solid for my needs.
I'm not sure there's a market in LLM "middleware" per se. Look at the market segments:
• B2C: wants vertically-integrated tools that provide "middleware" plus interface. Doesn't want to dick around. Often integrates their own service layer as well (see e.g. "character chat" apps); but if not, aims for a backend-integration experience that "knows about" the quirks of each service+model family, effectively commoditizing them. The ultimate aim of any service-generic app of this type is likely to provide a "subscription store" where you purchase subscriptions to inference services through the app, never visiting the service provider itself.
• B2B (think "using agents to drive pseudo-HFT bots for trades in fancy financial instruments that can't be arbitraged through dumb heuristics"): has a defined use-case and wants to control every detail of both "middleware" and backend together. Vertically integrates their own solution — on-prem inference cluster + custom-patched inference engine + business logic that synthesizes the entire prompt at every step. Doesn't bother with the "chat" abstraction other than as part of several-shot prompting.
• B2B2C: wants "scaling an inference cluster + engine + model deployment" to be Somebody Else's Problem; thinks of an "app agent experience" as the deliverable they want to enable their business customers to achieve through their product or service; and thus thinks of "middleware" as their problem / secret sauce — the thing they will build to enable "app agent experiences" to be created with the least business-customer effort possible. The "middleware" is where these B2B2C businesses see themselves making money. Thus, these B2B2C businesses aren't interested in paying some other middleman for a hosted "generic middleware framework as a service" solution; they're interested in being the only middleman, that captures all of the margin. They're interested in library frameworks they can directly integrate into their business layer.
---
For an analogy, think of the "middleware" of an "easy website builder" service like Squarespace/Wix/etc. You can certainly find vertically-integrated website-builder services; and you can also find slightly-lower-level library components to do what the "middleware part" of these website-builder services do. But you can't find full-on website-builder frameworks (powerful enough that the website-builder services actually use them) — let alone a white-labelable headless-CMS + frontend library "website builder builder" — let alone again, a white-labelable headless-CMS "website builder builder" that doesn't host its own data, but lets you supply your own backend.
Why?
Because B2C businesses just want Squarespace itself (a vertically-integrated solution); B2B businesses don't want an "easy website builder", they want a full-on web-app framework that allows them to control both the frontend and backend; and B2B2C businesses want to be "the Squarespace of X" for some vertical X, using high-ish-level libraries to build the highest-level website-building functionality, while keeping all of that highest-level glue code to themselves, as their proprietary "secret sauce." (Because if they didn't keep that highest-level code proprietary, it would function as a "start your own competitor to our service in one easy step" kit!)
---
The only time when the "refined and knowledge-enriched middleware abstraction layer-as-a-Service — but backend-agnostic!" approach tends to come up is to serve the use-case of businesspeople within B2B orgs, who want to be able to ask high-level questions or drive high-level operations without first needing to get a bespoke solution built by the engineering arm of said org. This is BI software (PowerBI), ERP software (NetSuite), CRM software (Salesforce), etc.
The weird / unique thing about LLMs, is that I don't think they... need this? The "thing about AI", is precisely that you can simply sit an executive in front of a completely-generic base-model chat prompt, and they can talk their way into getting it to do what they want — without an engineer there to gather + formalize their requirements. (Which is not to say that the executive can get the LLM to build software, correctly, to answer their question; but rather, that the executive can ask questions that invoke the agent's inbuilt knowledge and capabilities to — at least much of the time — directly answer the executive's question.)
For LLMs, the "in-context learning" capability mostly replaces "institutional knowledge burned into a generic middleware." Your generic base-model won't know everything your domain-specialist employees know — but, through conversation, it will at least be able to know what you know, and work with that. Which is usually enough. (At least, if your goal was to get something done on your own without bothering people who have better things to be doing than translating your question into SQL. If your goal is to work around the need for domain expertise, though... well, I don't think any "middleware" is going to help you there.)
In short: the LLM B2C use-case is also the LLM "B2Exec" use-case — they're both most-intuitively solved through vertical integration "upward" into the backend service layer. (Which is exactly why there was a wave of meetings last week, of businesspeople asking whether they could somehow share a single ChatGPT Pro $200/mo subscription across their team/org.)
When I bought my new MBP, I was wondering whether to upgrade the memory to 48GB, figuring it's increasingly likely I'll run local models during this laptop's 3-4 year cycle. So I took the leap and upgraded the memory.
Hoping that these kinds of tools will run well in these scenarios.
Some other alternatives (a little more mature / feature-rich):
anythingllm https://github.com/Mintplex-Labs/anything-llm
openwebui https://github.com/open-webui/open-webui
lmstudio https://lmstudio.ai/
Any recommendations for resources that compare these or provide more context in that regard?
Does it actually highlight the citations when opening the cited docs? Or were the highlights in the screenshot just there by chance?
This is not local, but uses the Tavily cloud (https://tavily.com/) ?!
Sure seems to. https://github.com/search?q=repo%3Ajohnbean393%2FSidekick%20...
This is getting murky.
As a suggestion to the author: please try to make it verifiably local-only, with an easy-to-set option.
Tavily search is an option, disabled by default.
Maybe the author could make a note of that in the README.
Just downloaded it and mucked about. It definitely works without the cloud, because it works while I'm offline. Looking at the code, it looks like an opt-in feature where you can provide your API key to Tavily.
That said, it seems built toward "Cheat on your homework" and doesn't reliably surface information from my notes, so I uninstalled it.
Looks nice, and I greatly appreciate the local only or local first mode.
The readme says:
> Give the LLM access to your folders, files and websites with just 1 click, allowing them to reply with context.
…
> Context aware. Aware of your files, folders and content on the web.
Am I right in assuming that this works only with local text files and that it cannot integrate with data sources in Apple’s apps such as Notes, Reminders, etc.? It could be a great competitor to Apple Intelligence if it could integrate with apps that primarily store textual information (but unfortunately in their own proprietary data formats on disk and with sandboxing adding another barrier).
Can it use and search PDFs, RTF files and other formats as “experts”?
The Apple data you mention has APIs for feeding it into LLMs if you wish. Someone just has to write the integration.
(I wrote one of those Apple API SDKs)
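Reminders, for instance, is reachable through EventKit, so the bridge is mostly plumbing plus a privacy prompt. A hedged Swift sketch of flattening open reminders into plain text that could be fed to an LLM as context; this is illustrative only, not the SDK mentioned above, and uses the macOS 14+ full-access API:

```swift
import EventKit

// Flatten open reminders into plain text suitable as LLM context.
// Requires a Reminders usage description in Info.plist.
func remindersAsContext() async throws -> String {
    let store = EKEventStore()
    guard try await store.requestFullAccessToReminders() else {
        throw NSError(domain: "RemindersAccess", code: 1)
    }
    let predicate = store.predicateForReminders(in: nil) // all reminder lists
    let reminders: [EKReminder] = await withCheckedContinuation { cont in
        _ = store.fetchReminders(matching: predicate) { cont.resume(returning: $0 ?? []) }
    }
    return reminders
        .filter { !$0.isCompleted }
        .map { "- " + ($0.title ?? "Untitled") }
        .joined(separator: "\n")
}
```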
> Am I right in assuming that this works only with local text files
One of the screen shots shows a .xlsx in the “Temporary Resources” area.
Also: I haven’t checked, but for a “Local-first” app, I would expect it to leverage the OS’s Spotlight text importers and run them on files it can’t natively process.
Some interesting features. I'm working on a similar native app with Qt, so it will support Linux, macOS and Windows out of the box. I might open source it as well.
https://www.get-vox.com
What differentiates this from Open WebUI? How did you design the RAG pipeline?
I had a project in the past where I had hundreds of PDF / HTML files of industry safety and fatality reports which I was hoping to simply "throw in" and use with Open WebUI, but I found it wasn't effective at this even in RAG mode. I wanted to ask it questions like "How many fatalities occurred in 2020 that involved heavy machinery?", but it wasn't able to provide such broad aggregate data.
I think this is a fundamental issue with naive RAG implementations: they aren't accurate enough for pretty much anything
Ultimately, the quality of OCR on PDFs is where we are bottlenecked as an industry. And not just in the text characters, but in understanding and feeding the LLM the structured object relationships we see in tables and graphs. Intuitive for a human, very error-prone for RAG.
That's a real issue, but that's masking some of the issues further downstream, like chunking and other context-related problems. There are some clever proposals to make this work, including some of the stuff from Anthropic and Jina. But as far as I can tell, these haven't been tested thoroughly because everyone is hung up at the OCR step (as you identified).
For my purposes, all of the data was also available in HTML format, so OCR wasn't a problem. I think the issue is that the RAG pipeline doesn't take the entire corpus of knowledge into its context when making a response; it uses an index to find one or more documents it believes are relevant, then uses that small subset as part of the input.
I'm not sure there's a way to get what a lot of people want RAG to be without actually training the model on all of your data, so they can "chat with it" similar to how you can ask ChatGPT about random facts about almost any publicly available information. But I'm not an expert.
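That is basically what naive retrieval does: chunks get embedded, the top-k nearest chunks are pasted into the prompt, and everything else stays invisible to the model, which is exactly why corpus-wide aggregate questions fail. A toy Swift sketch of that retrieval step; the embedding model and all data here are stand-ins:

```swift
// Toy illustration of the retrieval step in a naive RAG pipeline.
// Vectors would come from some embedding model; everything here is a stand-in.
struct Chunk { let text: String; let vector: [Float] }

func cosine(_ a: [Float], _ b: [Float]) -> Float {
    let dot  = zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
    let magA = a.reduce(0) { $0 + $1 * $1 }.squareRoot()
    let magB = b.reduce(0) { $0 + $1 * $1 }.squareRoot()
    return dot / (magA * magB + 1e-9)
}

// Only the k most similar chunks ever reach the prompt. A question like
// "how many fatalities in 2020 involved heavy machinery?" needs the whole
// corpus, but the model only ever sees these few chunks.
func retrieve(_ query: [Float], from corpus: [Chunk], k: Int = 5) -> [Chunk] {
    Array(corpus
        .sorted { cosine($0.vector, query) > cosine($1.vector, query) }
        .prefix(k))
}
```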
I've also observed this issue and I wonder where the industry is on it. There seem to be a lot of claims that a given approach will work here, but not a lot of provably working use cases.
The name gave me a flashback to Borland Sidekick
Was this the MS-DOS TSR app that kept running in the background and you could invoke at any time? Fond memories!
I was going to say the same thing. It had so many cool tools: a calculator, ASCII chart, notepad, calendar. And the whole idea of a TSR opened a door in my head, which hadn't seen multiple programs running at the same time until then.
That's the one. Nifty little program.
I thought of the phone with the spinning screen from the mid 2000s.
Great work! Please consider a plugin mode to support integrating with Dropbox and S3-compatible targets, where users might be storing large amounts of data off-device (but still device-accessible), as well as email providers via IMAP/JMAP.
I've been looking for something like this to query / interface with the mountain of home appliance manuals I've hung onto as PDFs - use case being that instead of having to fish out and read a manual once something breaks, I can just chat with the corpus to quickly find what I need to fix something. Will give it a shot!
An option to use a local LLM on the network without needing to download the 2GB "default model" would be great.
It's in the README https://github.com/johnbean393/Sidekick?tab=readme-ov-file#f...
Does anyone know if there is something like this or https://github.com/kevinhermawan/Ollamac for Linux? Both are built with Swift, and Swift also supports Linux!
Desktop-wise there's https://msty.app which is rather good but not open source. I'm using Open WebUI [1] with a desktop shortcut, but that's a web app.
1 - https://github.com/open-webui/open-webui
https://jan.ai is open source and works on linux.
Very cool, trying it out. I'm unable to make it do a search, though; in the Experts section it says it's deactivated in the settings, but I couldn't find a setting for it. Maybe it's model-dependent and the default model can't do it?
Nice, just needs a computer/browser use mode and thinking/agent mode. e.g. "Test this web app for me. Try creating a new account and starting a new order" etc.
This needs 164 MB of disk space. Not too bad. Thank you to the author for this.
That's just the binary. It needs at least another order of magnitude beyond that to download the model.
Why no MLX?
I think it uses Llama.cpp, which doesn't support MLX.
Looks like an awesome tool! I just found it funny that in the code interpreter demo, JavaScript is used to evaluate mathematical problems (especially the float comparison).
Does it support MCP?
Pretty slick, I've been using Ollama + https://github.com/kevinhermawan/Ollamac - not sure this provides much extra benefit. Still love to see it.
Looking forward to when there's a broad LLM API accessible in the browser via JS.
Very nice.
Trying to put this through its paces, I first set out to build my own local binary (because why not, and also because code-reading is fun when you've got your own local build) ..
But I get this far:
/Users/aa-jv/Development/InterestingProjects/Sidekick/Sidekick/Logic/View Controllers/Tools/Slide Studio/Resources/bin/marp: No such file or directory
It seems there is a hand-built binary resource missing from the repo - did anyone else do a build yet, and get past this step?
Looks like it wants https://github.com/marp-team/marp-cli
Yeah, I've manually copied that binary into place from the marp-cli package in Homebrew and now the build proceeds .. continuing as I type .. let's see what happens.
I'm immediately suspicious of such binary resources, however.
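One cheap sanity check before trusting a vendored binary is to hash it and compare against the copy Homebrew installs (which is exactly the copy dropped into place above). A small Swift sketch; the paths are examples, not the project's actual layout:

```swift
import Foundation
import CryptoKit

// Compare a vendored binary against a known-good copy by SHA-256.
func sha256(ofFileAt path: String) throws -> String {
    let data = try Data(contentsOf: URL(fileURLWithPath: path))
    return SHA256.hash(data: data).map { String(format: "%02x", $0) }.joined()
}

do {
    let vendored = try sha256(ofFileAt: "Sidekick/Resources/bin/marp") // example path
    let homebrew = try sha256(ofFileAt: "/opt/homebrew/bin/marp")      // example path
    print(vendored == homebrew ? "Binaries match" : "Binaries differ")
} catch {
    print("Could not read one of the binaries: \(error)")
}
```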
> Image generation is availible on macOS 15.2 or above, and requires Apple Intelligence.
... so image generation is not fully offline?
This tool looks like it could be worth a try to me, but only if I'm sure I can run it into a mode that's fully offline.
Some features use Apple's Private Cloud Compute, but it's pretty decent in terms of security and privacy.
https://security.apple.com/blog/private-cloud-compute/
"Pretty decent" is irrelevant for proprietary code that's not my property.
The only safe option is a guarantee that data never leaves my machine, which for this app means disabling anything that has a chance of exfiltrating it.
I encourage you to try to understand how verifiable transparency works.
https://security.apple.com/blog/pcc-security-research/
I doubt my customer - on whose proprietary code I want to try running LLMs - cares :)
Or to rephrase: would you go to court with the contents of that link as evidence that you haven't inadvertently published someone else's proprietary data in some external database?
Apple Intelligence image generation is fully offline
Isn't Apple Intelligence image generation fully offline?
I don't know, I'm asking.
Only want it for some code so it looks like it can be fully offline, but it's worth being paranoid about it.
I don't think Apple has missed out on much (yet). The best LLMs (e.g. GPT-4o, Sonnet 3.7) are nowhere near being able to run locally, and they still make mistakes.
Some LLMs can run locally, but are brutally slow with small context windows.
Apple is likely waiting until you can run a really good model on device (i.e. iOS), which makes sense to me. It's not like they're losing customers over this right now.
They are playing the long game, as they always have: wait until the silicon enables it for most users. The Apple Silicon track record suggests that in a couple of years we'll get M3-Ultra-class capabilities across all Apple devices. Some day the baseline will be enough to run state-of-the-art LLMs on device.
Siri hasn't run on device for most of its existence. It's only in the last few years that Apple suddenly decided it was a priority.
All they have to show is incremental improvement over Siri. For that, quantized models are more than enough, in my opinion.
Sonnet 3.7 best? That thing is a dumpster fire. Totally useless vs 3.5.
Just checked some Genmojis created on Reddit; wow, I don't know how that got approved. I'm all for creativity and freedom, but it's 100% not Apple's brand.
And they just postponed AI Siri to 2026 after promising it for the iPhone 16. I seriously don't get how it can be that hard: a small model trained on various app APIs, a checker model that double-checks, an "approve this action" button. Not that hard.
Really cool! I hope they'll roll out MCP support so that we can add support for it in our MCP app store (https://github.com/fleuristes/fleur)
Right now only code editors and Claude support MCPs, but we'd love to see more clients like Sidekick