/r/MachineLearning - top ten submissions for each month of 2025

2025, December

465 Ilya Sutskever is puzzled by the gap between AI...

304 [D] Best papers of 2025

286 [P] My DC-GAN works better then ever!

288 [D] Published paper uses hardcoded seed and col...

209 [P] Eigenvalues as models

197 [D] r/MachineLearning - a year in review

193 [D] On low quality reviews at ML conferences

151 How do you as an AI/ML researcher stay current ...

141 [D] How does Claude perform so well without any...

126 [D] AISTATS is Desk-Rejecting Papers Where Auth...

2025, November

1599 [D] Got burned by an Apple ICLR paper — it was ...

362 [D] Tsinghua ICLR paper withdrawn due to numero...

311 [R] Knowledge Graph Traversal With LLMs And Alg...

304 [R] LeJEPA: New Yann Lecun paper

208 [D] Why TPUs are not as famous as GPUs

209 Reasoning models don't degrade gracefully - the...

189 [D] ICLR 2026 Paper Reviews Discussion

185 [D] <ICLR review comment> Is this real?

180 [R] Unvalidated Trust: Cross-Stage Vulnerabilit...

183 [D] ICLR reviewers being doxed on OpenReview

2025, October

279 [N] Stanford is updating their Deep Learning co...

268 [D]NLP conferences look like a scam..

267 [D] Bad Industry research gets cited and publis...

155 [D] Why are Monte Carlo methods more popular th...

144 [R] DeepSeek 3.2's sparse attention mechanism

142 [R] PKBoost: Gradient boosting that stays accur...

133 [P] Adapting Karpathy’s baby GPT into a charact...

128 [D] Need career advice, just got rejected for a...

123 Google PhD Fellowship recipients 2025 [D]

115 [D] Open source projects to contribute to as an...

2025, September

396 [D] Is senior ML engineering just API calls now?

192 [D]: How do you actually land a research scient...

179 [D] which papers HAVEN'T stood the test of time?

141 [D] NeurIPS: rejecting papers from sanctioned a...

123 Why Language Models Hallucinate - OpenAi pseudo...

104 [D] Do you ever miss PyTorch-style workflows?

99 [R] DynaMix: First dynamical systems foundation...

92 [D] How about we review the reviewers?

90 [D] NeurIPS should start a journal track.

84 [P] Built a differentiable parametric curves li...

2025, August

393 [R] Position: The Current AI Conference Model i...

163 [D] How do researchers ACTUALLY write code?

149 [D] How did JAX fare in the post transformer wo...

119 [R] ?APT: critical review aimed at maximizing c...

100 [P] From GPT-2 to gpt-oss: Analyzing the Archit...

95 [D] Have any Bayesian deep learning methods ach...

94 [D] People in ML/DS/AI field since 5-10 years o...

86 I built a tool to benchmark tokenizers across 1...

72 [D] PhD vs startup/industry for doing impactful...

69 [R] I’ve read the ASI-Arch paper — AI discovere...

2025, July

585 [R] NeuralOS: a generative OS entirely powered ...

232 [D] Gemini officially achieves gold-medal stand...

180 Favorite ML paper of 2024? [D]

142 [D] AI/ML interviews being more like SWE interv...

128 [P] Understanding Muon: A Revolutionary Neural ...

106 [D] Position: Machine Learning Conferences Shou...

99 [D] How will LLM companies deal with CloudFlare...

90 [D] What resources would Theoretical ML researc...

77 [D] Views on DIfferentiable Physics

75 [P] I tried implementing the CRISP paper from G...

2025, June

373 [D] Machine Learning, like many other popular f...

286 [P] Interactive Pytorch visualization package t...

242 [P]: I reimplemented all of frontier deep learn...

238 [R] LLMs are Locally Linear Mappings: Qwen 3, G...

194 [D] What underrated ML techniques are better th...

177 [D] Burned out mid-PhD: Is it worth pushing thr...

151 I'm not obsolete, am I? [P]

128 [R] Log-Linear Attention

103 [D] Are GNNs/GCNs dead ?

91 [D] The effectiveness of single latent paramete...

2025, May

436 [D] What Yann LeCun means here?

270 [D] Google already out with a Text- Diffusion M...

241 [D] Has a research field ever been as saturated...

191 [D] Overleaf is down?

149 [R] AlphaEvolve: A coding agent for scientific ...

141 [D] Why is RL in the real-world so hard?

123 Absolute Zero: Reinforced Self-play Reasoning w...

102 [D] How do students have so many top tier confe...

97 [R] Leaderboard Hacking

95 [R] Meta releases synthetic data kit!!

2025, April

263 arXiv moving from Cornell servers to Google Cloud

156 [R] Implemented 18 RL Algorithms in a Simpler Way

142 [R] Beyond-NanoGPT: Go From LLM Noob to AI Rese...

129 [P] I made a bug-finding agent that knows your ...

126 [D] ICML 2025: A Shift Toward Correctness Over ...

123 [D] A very nice blog post from Sander Dielman o...

117 [R] One Embedding to Rule Them All

113 [R] Neuron Alignment Isn’t Fundamental — It’s a...

108 [R] Proof or Bluff? Evaluating LLMs on 2025 USA...

98 [D] When will reasoning models hit a wall?

2025, March

424 Andrew Barto and Richard Sutton are the recipie...

264 [Research]Can AI remember irreversibly, like a ...

240 [R] 34.75% on ARC without pretraining

186 [P] I'm starting a GPU mini-grant

151 [P] I made weightgain – an easy way to train an...

134 Gemma 3 released: beats Deepseek v3 in the Aren...

101 [D] Math in ML Papers

100 [R] Had a paper accepted at CVPR, should I put ...

100 [D] Importance of C++ for Deep Learning

94 [R] How to start writting papers as an independ...

2025, February

618 [D] Which software tools do researchers use to ...

277 [D] How you do ML research from scratch?

181 [D] Why mamba disappeared?

168 [R] LIMO: Less is More for Reasoning

165 [D] CVPR 2025 Final Decision

161 [R] reasoning models are indecisive parrots

161 [D] We built GenAI at Google and Apple, then le...

153 [D] Fine-tuning is making big money—how?

144 [R] "o3 achieves a gold medal at the 2024 IOI a...

114 [D] What are current UNPOPULAR research topics ...

2025, January

537 [P] Built a Snake game with a Diffusion model a...

439 [d] Why is "knowledge distillation" now suddenl...

387 [D]: A 3blue1brown Video that Explains Attentio...

317 [P] How I found & fixed 4 bugs in Microsoft...

267 [D] I hate softmax

188 [D] Have transformers won in Computer Vision?

178 [D] Ran Deepseek R1 32B Locally

165 [P] Building an Reinforcement Learning Agent to...

143 [D] Misinformation about LLMs

137 Grokking at the Edge of Numerical Stability [Re...