for something more directly comparable to the niche ISLR fills, Bishop's books are generally better - although I can't recall their titles
I took this class at ETH Zurich and it is one of my favorite classes, especially how they quantify uncertainty and build the starting blocks of reinforcement learning. I think it's an excellent read for data scientists and ML engineers.
From a brief search I see that it isn't (or at least not yet), but seeing how well-formatted the pdf is, and the fact that it's CC-licensed, you could print it yourself, or perhaps talk with them to organize a batch.
Though I personally prefer to read these sorts of books directly from pdf, and am grateful to them for sharing it on arxiv.
I wonder if one could organize an arXiv print service that binds and prints and ships with a unique cover and such.
Also it should use LLMs and the blockchain.
But this would be nice. There are a number of papers and such where, if I could submit an arXiv link to a print service, I would probably buy a copy. I wonder why no one does it.
Aren't you describing Lulu but for the very niche case of arxiv publications that are small books but not published as books? I think you could do it in a weekend with their API.
This worked four years ago when the API had just launched, but there might have been changes since, so no guarantees.
Most ArXiv PDFs are probably lulu-printable out of the box, but to make a general solution, one would probably need to do some pre-processing with ghostscript (gs), e.g. embed all fonts and flatten images (no transparency).
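For a one-off, a pre-processing step might look like the sketch below (the Ghostscript flags are standard pdfwrite options; the file names are placeholders, and the call is guarded so nothing runs unless `gs` and an input PDF actually exist):

```python
import os
import shutil
import subprocess

# Re-distill a PDF for print: embed/subset all fonts and flatten
# transparency. Targeting PDF 1.3, which predates transparency, forces
# pdfwrite to flatten it. "paper.pdf" / "print-ready.pdf" are placeholders.
cmd = [
    "gs", "-dNOPAUSE", "-dBATCH", "-dQUIET",
    "-sDEVICE=pdfwrite",
    "-dCompatibilityLevel=1.3",   # PDF 1.3 has no transparency
    "-dEmbedAllFonts=true", "-dSubsetFonts=true",
    "-sOutputFile=print-ready.pdf", "paper.pdf",
]
if shutil.which("gs") and os.path.exists("paper.pdf"):
    subprocess.run(cmd, check=True)
```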
The "book" is accompanying study material for the course Probabilistic AI at ETH Zurich. Essentially, each chapter is the material covered in one lecture (3 hrs).
There's no printed version. Btw, I took this course at ETHZ last year (a course with this title, whose script is this document). Pretty nice course and pretty nice course notes; happy to see that the authors decided to share them outside of the course website now!
Books suck (imho). We need a new format to teach and learn this deep technical stuff. Not youtube, something interactive with exercises and engagement.
It's a wild world where "reading the documentation" or "researching a topic" has become a career superpower. I'm glad my education largely predated social media and cell phones, and that I learned to read and work problems independently. OTOH it often makes work a very lonely, taxing experience. Being a human index into documentation is a hell of a lot less fulfilling than working with people who also can read.
I urge you to rethink this perspective. Research consistently shows that paper significantly increases comprehension over screens and even over e-ink. Additionally, taking notes by hand has a further positive impact.
Which is a major omission, as transformer-based language models are the most powerful available form of "probabilistic artificial intelligence". They predict a probability distribution over a token given a sequence of previous tokens.
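That last sentence is easy to make concrete with a toy softmax (the vocabulary and logit values below are invented for illustration):

```python
import numpy as np

# A language model maps a context to one score (logit) per vocabulary
# token; softmax turns those scores into a probability distribution
# over the next token.
vocab = ["banana", "apple", "tire"]
logits = np.array([2.0, 1.0, -3.0])   # made-up scores for some context

probs = np.exp(logits - logits.max())  # subtract max for stability
probs /= probs.sum()                   # normalize: probs now sums to 1
```

Sampling or greedy decoding then picks the next token from `probs`, and the process repeats with the extended context.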
My guess is that most of the content in the book is several years old (it's apparently based on an ETH Zurich class), despite the PDF being compiled this year, which would explain why it doesn't cover the state of the art.
The text has some great explanatory diagrams and looks to be a very high-quality overview of ML through the lens of probability, with lots of math.
I was also recently impressed by Zhao's "Mathematical Foundation of Reinforcement Learning", a free textbook with video lectures on YT: https://github.com/MathFoundationRL/Book-Mathematical-Founda...
If you don't have a lot of time, at least glance at Zhao's overview contents diagram; it's a good conceptual map of the whole field, imo. Here:
https://github.com/MathFoundationRL/Book-Mathematical-Founda...
and maybe watch the intro video.
The first lecture is so good. Not only in terms of content, but in how Zhao explains how to think about learning as a student. Thanks for the recommendation.
I came across this a few days ago, and my excuse to give it a serious look is that Andreas Krause has some deep and interesting research in Gaussian Processes and Bandits [1].
[1] https://scholar.google.com/scholar?start=10&q=andreas+krause...
It's Krause; he's one of the biggest researchers in the field. At least based on the other work of his I've read, he's a good writer too. This ought to be a worthwhile read.
Existential Reality is potential distribution not arrangement of states.
Potential exists, probability is a mathematical description of its distribution. Every attribute is a dimension (vector). State is merely a passing measurement of resolve. Potential interacts through constructive and destructive interference. Constructive and destructive interference resolve to state in a momentary measure of “now” (an inevitably decaying proposition).
Existential Reality is potential distributing, not arrangements of state.
Stupid question: can an LLM (i.e. a neural network) tell me the probability of the answer it just spewed? i.e. turn into fuzzy logic? Aaand, can it tell me how much it believes itself? i.e. what's the probability that the above probability is correct? i.e. confidence, i.e. intuitionistically fuzzy logic?
A long time ago at uni we studied these things for a while, and even made a Prolog interpreter with both F+IF (probability + confidence) coefficients for each and every term.
You can say "give me a % chance you think this thing will happen and why" and it will spit out a lot of the context behind its thinking. I'm not a math guy, and I'm aware "probability" has some more complex math behind it, but just from a "why do you believe this so strongly?" perspective, I've personally found it's able to make me agree or disagree fairly. You can then give it additional context you know about, and it will refine its estimate. Basically I've started treating them like context-connection systems, just to see if dots could even possibly connect before I do the connecting myself.
Not out of the box, I think; I wouldn’t trust any self-assessment like that. With enough compute, you could probably come up with a metric by doing a beam search and using an LLM to evaluate how many of the resultant answers were effectively the same, as a proxy for “confidence”.
Similar to bootstrapping a random variable in statistics. Your N estimates (each estimate is derived from a subset of the sample data) give you an estimate of the distribution of the random variable. If the variance of that distribution is small (relative to the magnitude of the point estimate) then you have high confidence that your point estimate is close to the true value.
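A minimal sketch of that bootstrap idea, with synthetic data (the distribution, sample size, and resample count are all arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)   # made-up data

# Resample with replacement many times and recompute the statistic;
# the spread of the resampled estimates approximates the sampling
# distribution of the point estimate.
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(1000)
])

point = sample.mean()
spread = boot_means.std()   # small spread relative to point -> high confidence
```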
Likewise in your metric, if all answers are the same despite perturbations then it's more likely to be ... true?
I'd really like to see a plot of your metric versus the SimpleQA hallucination benchmark that OpenAI uses.
Confidence != true
The way I understand it, an LLM response is a chain of tokens where each is the most probable token. Maybe there exist more complicated candidate-selection approaches than that, but "biggest number" works for me. For the sake of simplicity, let's just say tokens are words. You'd have access to the probability of each word in the ordering of the sentence, but I'm not sure how that would then be used to evaluate the probability of the sentence itself, or its truthiness.
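For what it's worth, the per-token probabilities do compose into a sequence probability via the chain rule; the numbers below are invented:

```python
import math

# Each entry is P(token_i | tokens before it), e.g. for a five-word
# sentence. The sequence probability is their product, usually computed
# as a sum in log-space to avoid underflow on long sequences.
token_probs = [0.9, 0.5, 0.7, 0.8, 0.6]

seq_logprob = sum(math.log(p) for p in token_probs)
seq_prob = math.exp(seq_logprob)
```

That gives likelihood under the model, though, not truthiness: a fluent falsehood can score higher than an awkwardly worded fact.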
https://plato.stanford.edu/entries/self-locating-beliefs/
Is this helpful?
I could have worded my reply better, but the simplified explanation stands :b
According to the following paper, it's possible to get calibrated confidence scores by directly asking the LLM to verbalize a confidence level, but it strongly depends on how you prompt it to do so:
https://arxiv.org/abs/2412.14737
I'm not 100% sure what you mean by this, but there is token probability available in some providers:
https://cookbook.openai.com/examples/using_logprobs
I think the logprobs functionality was removed because it made it possible to exfiltrate the last layer of weights of their proprietary models.
Suitably modified, they can. Bayesian neural networks provide uncertainty quantification. The challenge is calibrating the predictions, and deciding whether the model capacity devoted to uncertainty quantification would be better spent on a bigger model without it.
https://en.wikipedia.org/wiki/Calibration_(statistics)
Example: Efficient and Effective Uncertainty Quantification for LLMs (https://openreview.net/forum?id=QKRLH57ATT)
Yes, but those probabilities tend to be poorly calibrated, especially after the tuning they get for instruction following and such.
Maybe a stupid answer, but I’ve read a few older papers that used ensembles to identify when a prediction is out of distribution. Not sure what the SotA approach is, though.
I think we’ll need a GUI for the models to democratize interpretability and let even gamers explore them. Basically, train another model that takes the LLM, converts it into 3D shapes, and puts them in a 3D world that is understandable for humans.
Simpler example: represent an LLM as a green field with objects, where humans are the only agents:
You stand near a monkey and see a chewing mouth nearby, so you go there (your prompt is now “monkey chews”). Close by, you see an arrow pointing at a banana; farther away, an arrow points at an apple; very far away, at the horizon, an arrow points at a tire (monkeys rarely chew tires).
So things close by are more likely tokens and things far away are less likely, and you see all of them at once (maybe you’re on top of a hill to see farther). This way we can make a form of static place AI, where humans are the only agents.
I had a mind-bending Salvia trip at eighteen that went sort of like that.
My mind turned into an infinitely large department store where each aisle was a concurrent branch of thought, and the common ingredient lists above each aisle were populated with words, feelings and concepts related to each branch.
The PA system replaced my internal monologue, which I no longer had, but instead I was hearing my thoughts externally as if they were another person's.
I was able to walk through these aisles and marvel at the immense, fractal, interdependent web of concurrent thought my brain was producing in realtime.
“When I began to navigate psychospace with LSD, I realized that before we were conscious, seemingly self-propelled human beings, many tapes and corridors had been created in our minds and reflexes which were not of our own making. These patterns and tapes laid down in our consciousness are walled off from each other. I see it as a vast labyrinth with high walls sealing off the many directives created by our personal history.
Many of these directives are contradictory. The coexistence of these contradictory programs is what we call inner conflict. This conflict causes us to constantly check ourselves while we are caught in the opposition of polarity. Another metaphor would be like a computer with many programs running simultaneously. The more programs that are running, the slower the computer functions. This is a problem then. With all the programs running that are demanded of our consciousness in this modern world, we have problems finding deep integration.
To complicate matters, the programs are reinforced by fear. Fear separates, love integrates. We find ourselves drawn to love and unity, but afraid to make the leap.
What I found to be the genius of LSD is that it really gets you high, higher than the programs, higher than the walls that mask and blind one to the energy destroying presence of many contradictory but hidden programs. When LSD is used intentionally it enables you to see all the tracks laid down, to explore each one intensely. It also allows you to see the many parallel and redundant programs as well as the contradictory ones.
It allows you to see the underlying unity of all opposites in the magic play of existence. This allows you to edit these programs and recreate superior programs that give you the insight to shake loose the restrictions and conflicts programmed into each one of us by our parents, our religion, our early education, and by society as a whole.”
~ Nick Sand, 2001, Mind States conference, quoted in Casey Hardison's obituary
I feel like if all the things people believe and espouse about hallucinogens were true, and not just the effect of permanently damaging your mind with the illusion of wisdom, we’d be able to point at all the revolutionary scientific breakthroughs and discoveries made under the influence of hallucinogenic substances.
However, everyone I’ve met who admits to having taken hallucinogens seems reduced in some way, rather than enhanced. Like the lights are on but someone else is home.
This was sort of my experience with LSD. It just broke me. I fell into a deep depression afterward, but the reason was only partly due to damaging my mind. The other part of it was that the LSD made me realize where my life was going, and how completely unfulfilled I'd end up being in 10-20 years. In that way, it helped me course-correct. I'm healthier, more honest with myself, and got back into college because of the experience.
But it did damage my mind. I have mild to moderate anhedonia now. Weed hits me completely differently now (feels more like strong caffeine + brain fog instead of any pleasure). I lost my desire to write creatively.
Unless you did a thumbprint, you're perfectly fine, no damage. Just get your shit together, that seems to be your takeaway. Sounds like it worked. Now you have to keep working on yourself rather than blaming a harmless drug for your problems.
If a thumbprint can do damage, why can't a lower dose? What's the mechanism?
Water will kill you if you consume too much of it - almost universally, the poison is in the dose not the substance.
The mechanism for water toxicity is physically displacing ions in the brain, causing osmotic damage. (This also explains why lower amounts of water are safe.) A thumbprint dose of LSD is nowhere near high enough to do something like that.
It's similar to one aspirin vs 10000. Also, people do thumbprints and live normal lives after, I just threw that out as an extreme. LSD is very safe if you're doing normal doses. Most people don't take 10000 doses so I can't really speak to that.
There are lots of examples, you've just decided not to see them. Just for a start, here's an account by Ralph Abraham of the impact of psychedelics on mathematics. http://www.ralph-abraham.org/articles/MS%23124.Maps/maps2.pd...
There is the Qualia Research Institute; one of the things they do is take psychedelics and make simulations of the experience. They basically found two main types of drugs:
1. Most “create more separate personalities” in you
2. One (the “toad poison”) actually makes you feel like a giant place; the feeling is usually pleasant.
So there is either “agentification” of you into more “agents” or “space-ification” into one giant place without any agency in it. I think we can make this static place AI and it’ll be safe by definition because we’re the only agents in it.
P.S. I don’t promote drugs
A fair observation, but it rests on assumptions about progress, what it means, and what is valuable.
I don't know if he credits the drug for the inspiration but the inventor of PCR was famously a user of LSD.
It happens all the time. DNA double helix is a good example. You really think people are going to mention their drug use in white papers? I think not. Nothing to gain and everything to lose.
The record of a scientific discovery that is heavily criticized for plagiarism and falsehoods is probably not a good example, actually.
Well here's one: tons of people do drugs. It's not even a question whether or not drugs have inspired discoveries. They obviously have.
Tons of people eat carrots. That doesn’t imply carrots have inspired discoveries.
That is a textbook false equivalence.
It’s exactly the same argument.
Please explain how they are equivalent and why the psychoactive properties of a psychedelic don't make it categorically different in this situation than a carrot.
The argument was: lots of people do X, so X has obviously inspired discoveries.
Carrots inspire science, too.
https://phys.org/news/2015-12-tablet-screen-brought-aid-carr...
> It was the Austrian botanist Friedrich Reinitzer and the German physicist Otto Lehmann who discovered liquid crystals in 1888 when they were experimenting with the natural substances found in carrots
Carrots have also inspired art, I don't need to find an example of that.
That said, if you view that premise as it being a given that X has inspired something just because lots of people do it, then it is indeed preposterous.
However, if you view it as "a critical mass of people have done mind-altering substances and thus it's almost a certainty that discoveries have been made under and due to their influence", then it makes a lot more sense. It's a fundamentally different argument and you can't generalize it to any non-mind-altering substance. The point is that the mind is altered by the substance, causing different modes of thought. I think you were just confused about the way the other commenter laid out their premise, because it's a purely statistical argument with the qualifier that we're talking about substances that measurably alter thought processes.
Have they? I find it likely, personally, but it doesn’t mean it’s not entirely coincidental either.
We have our first Neo candidate)
The guy who’ll make the GUI for LLMs is the next Jobs/Gates/Musk and Nobel Prize Winner (I think it’ll solve alignment by having millions of eyes on the internals of LLMs), because computers became popular only after the OS with a GUI appeared. I recently shared how one of its “apps” possibly can look: https://news.ycombinator.com/item?id=43319726
If you feel like being a hippie, you can find the "rendering engine for reality" in here:

- Mandelbrot (1980) – The Mandelbrot Set and fractal geometry
- Julia (1918) – Mémoire sur l'itération des fonctions rationnelles (Julia sets)
- Meyer (1996) – Quantum Cellular Automata (procedural complexity)
- Wolfram (1984) – Cellular automata as models of complexity
- Bak et al. (1987) – Self-organized criticality
- Wolfram, Gorard & Crowley (2020) – "A Class of Models with the Potential to Represent Fundamental Physics"
- Kari & Culik (2009) – "Universal Pattern Generation by Cellular Automata"

Just combine the papers; ofc, that is crazy, but it's fun to be a bit crazy sometimes. It's one of my fav thought experiments, just for fun. :)
I used Wolfram’s physics approach to “model the absolute future and possible ethics of it”. I think it’s a direct democratic simulated multiverse; you can find the link in my profile if interested.
no need to read it, no disagreement here.
I don't think anyone has found a good way to map higher-dimensional spaces onto 4D visualizations yet.
Maybe this is why tokens and language are so useful for humans? They might be the closest analog we have.
Good point, I think at least some lossy “compression” into a GUI is possible. I recently shared how one of its “apps” possibly can look: https://news.ycombinator.com/item?id=43319726
What
What what?)
Sir, this is a Wendy's.
What
Another related shocking idea: https://news.ycombinator.com/item?id=43319726
I found Gaussian Processes with the right kernel to be very powerful with even just a few data points and a very small set of parameters. I don't know if I was using it correctly tbh, but it worked out great at predicting values that I couldn't otherwise predict so accurately. I used it as a predictable yet non-linear process to tweak the input in a computer vision task. The proof was literally in the pudding.
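For anyone curious what "a few data points, a small set of parameters" looks like in practice, here is a minimal GP regression sketch in plain NumPy. The kernel, data, and noise level are illustrative choices of mine, not the commenter's actual setup:

```python
import numpy as np

def rbf(a, b, ell=1.0, sf=1.0):
    # Squared-exponential (RBF) kernel: the main modeling choice in a GP
    d2 = (a[:, None] - b[None, :]) ** 2
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

# A handful of noisy observations of an unknown nonlinear function
rng = np.random.default_rng(0)
X = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
y = np.sin(X) + 0.05 * rng.standard_normal(X.size)

sigma_n = 0.05  # assumed observation noise
K = rbf(X, X) + sigma_n**2 * np.eye(X.size)
Xs = np.linspace(0, 5, 50)
Ks = rbf(X, Xs)

# Standard GP posterior equations via a Cholesky factorization
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = Ks.T @ alpha                              # posterior mean at test points
v = np.linalg.solve(L, Ks)
var = rbf(Xs, Xs).diagonal() - np.einsum("ij,ij->j", v, v)
std = np.sqrt(np.maximum(var, 0))                # posterior uncertainty
```

The appeal is exactly what the comment describes: with only five points and three scalar parameters (length scale, signal variance, noise), you get a nonlinear predictor plus an uncertainty band for free.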
Seems similar to, or at least partially overlapping with, what I would say is the best reference on the subject, An Introduction to Statistical Learning by Gareth James et al. [1].
I wonder if this one might be a bit more accessible, although I guess the R/Python examples in the latter are helpful.
[1] https://www.statlearning.com/
not really, islr is a pretty basic book - this is about more advanced techniques that propagate probability estimates rather than pointwise predictions
and frankly i would not recommend islr anymore today, too dated
What would you (or other informed parties) recommend?
it’s been a while since I’ve been a beginner so I might not have the best resources, but I would recommend Harvard’s Stat 110 with Joe Blitzstein (lectures online) and then Machine Learning by Kevin Murphy. might be a scarier book to someone not confident in their math, but overall a better one imo
for something more directly comparable to the niche ISLR filled, Bishop’s books are generally better - although I can’t recall their titles
https://www.bishopbook.com/ is the new one
Kevin Murphy racing to rename his Probabilistic Machine Learning series.
Gemini 2.0 Experimental 02-05 sees this as "only" 107K tokens.
Handy if you want help breaking this down.
https://aistudio.google.com
'Laplace Approximation is a "quick and dirty" way to turn a complex probability distribution into a simple Gaussian (bell curve).
It works by finding the highest point (mode) and matching the curvature at that point.
It's fast and easy, but it can be very inaccurate and overconfident if the true distribution doesn't look like a bell curve.'
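To make that summary concrete, here is a 1D toy sketch using SciPy (the Gamma(3, 1) target is my own illustrative choice, not an example from the book): find the mode, match the curvature there, and read off the Gaussian's standard deviation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Unnormalized log-density of a skewed target: Gamma(3, 1), p(x) ∝ x^2 e^{-x}
def log_p(x):
    return 2.0 * np.log(x) - x

# Step 1: find the mode (highest point of the distribution)
res = minimize_scalar(lambda x: -log_p(x), bounds=(0.1, 20.0), method="bounded")
mode = res.x  # analytically, the mode is at x = 2

# Step 2: match the curvature at the mode via a finite difference:
# sigma^2 = -1 / (d^2/dx^2 log p)(mode)
h = 1e-4
curv = (log_p(mode + h) - 2 * log_p(mode) + log_p(mode - h)) / h**2
sigma = np.sqrt(-1.0 / curv)  # analytically sqrt(2) here

# The Laplace approximation is then Normal(mode, sigma^2) -- overconfident
# wherever the true density is skewed or heavy-tailed, as this one is.
```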
I took this class at ETH Zurich and it is one of my favorite classes, especially how it covers quantifying uncertainty and builds up the starting blocks of reinforcement learning. I think it’s an excellent read for data scientists and ML engineers.
This document is the lecture notes.
This is great. Is it available as a printed book?
From a brief search I see that it isn't (or at least not yet), but seeing how well-formatted the pdf is, and the fact that it's CC-licensed, you could print it yourself, or perhaps talk with them to organize a batch.
Though I personally prefer to read these sorts of books directly from pdf, and am grateful to them for sharing it on arxiv.
I wonder if one could organize an arXiv print service that binds and prints and ships with a unique cover and such.
Also it should use LLMs and the blockchain.
But this would be nice: there are a number of papers and such where, if I could submit an arXiv link to a print service, I would probably buy a copy. I wonder why no one does it.
Aren't you describing Lulu but for the very niche case of arxiv publications that are small books but not published as books? I think you could do it in a weekend with their API.
If anyone is interested in trying this, here is some Python starter code you might find useful: https://github.com/minireference/lulu-api-client?tab=readme-...
This worked four years ago when the API had just launched, but there may have been changes since, so no guarantees.
Most ArXiv PDFs are probably lulu-printable out of the box, but to make a general solution, one would probably need to do some pre-processing with ghostscript (gs), e.g. embed all fonts and flatten images (no transparency).
yes, that's what i thought after the post. haha.
I wonder if they're aiming for it to be a book. Hübotter describes it on his web page as "notes on Probabilistic AI".
The "book" is accompanying studying material for the course Probabilistic AI at ETH Zurich. Essentially each chapter is the material covered in one lecture (3hrs).
Source: I did the course
I think this is where a lot of textbooks come from.
I do not think so. I am asking the author for confirmation.
There's no printed version. Btw, I took this course at ETHZ last year (a course with this title, whose script is this document). Pretty nice course and pretty nice course notes; happy to see that the authors decided to share it outside of the course website now!
Probabilistic Robotics book by Thrun and co. offers a great overview of most of these concepts.
Interesting separation and distinction between noisy inputs, noisy processing and noisy chains.
As a layman in this field, I have no idea of the context or significance of this work. Can someone better informed inform us?
> be me
> open article
> "holy shit it's 400 pages"
> realize i already have a grasp on most of the material from school
> "phew"
> oh this stuff is cool, just like i remember...
> proceed to read all 400 pages
well done! :clap:
Books suck (imho). We need a new format to teach and learn this deep technical stuff. Not youtube, something interactive with exercises and engagement.
> something interactive with exercises and engagement
Books have exercises. It's your job to engage.
This book, in particular, has 3 pages of Problems per chapter. The only way to learn the math is to do all of them.
thanks. i was worried about job security for a nanosecond.
It's a wild world where "reading the documentation" or "researching a topic" has become a career superpower. I'm glad my education largely predated social media and cell phones, and that I learned to read and work problems independently. OTOH it often makes work a very lonely, taxing experience. Being a human index into documentation is a hell of a lot less fulfilling than working with people who also can read.
I urge you to rethink this perspective. All the research shows that paper increases comprehension significantly over screens and even over e-ink. Additionally, taking notes by hand likewise has a positive impact.
It's never worked for me. I own tons of books that i want to read but never will.
yeah, I mean 3Blue1Brown has done a great job... and maybe those would be even better if you could app-ify them into something you can interact with.
The current gen of LLM programming AIs might make it less legwork to build these.
3b1b is great but if you want to do deep technical work, you’re eventually going to have to get comfortable with text as a medium
Well, there's also school.
Apparently they don't discuss language models at all.
Which is a major omission, as transformer-based language models are the most powerful available form of "probabilistic artificial intelligence". They predict a probability distribution over a token given a sequence of previous tokens.
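For illustration, the final step of that prediction is just a softmax over the model's output logits. A toy sketch with an invented four-word vocabulary (the words and logit values are made up, standing in for a real transformer's final layer):

```python
import numpy as np

# Toy vocabulary and logits, standing in for a transformer's last-layer output
# after conditioning on the previous tokens
vocab = ["the", "cat", "sat", "mat"]
logits = np.array([2.0, 0.5, 1.0, -1.0])

# Softmax turns logits into a probability distribution over the next token;
# subtracting the max is the standard trick for numerical stability
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Greedy decoding picks the most probable token; sampling would draw from probs
next_token = vocab[int(np.argmax(probs))]
```

The point of calling this "probabilistic" is that the model's output really is a full distribution over the vocabulary, not a single answer, even though decoding usually collapses it to one token.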
My guess is that most of the content in the book is several years old (it's apparently based on an ETH Zurich class), despite the PDF being compiled this year, which would explain why it doesn't cover the state of the art.