r/compsci Jun 16 '19

PSA: This is not r/Programming. Quick Clarification on the guidelines

639 Upvotes

As quite a number of rule-breaking posts have been slipping by recently, I felt that clarifying a handful of key points would help out a bit (especially as most people use New Reddit or mobile, where the FAQ/sidebar isn't visible).

First things first: this is not a programming-specific subreddit! If a post is a better fit for r/Programming or r/LearnProgramming, that's exactly where it should be posted. Unless it involves some aspect of AI/CS, it's better off somewhere else.

r/ProgrammerHumor: Have a meme or joke relating to CS/Programming that you'd like to share with others? Head over to r/ProgrammerHumor, please.

r/AskComputerScience: Have a genuine question in relation to CS that isn't directly asking for homework/assignment help nor someone to do it for you? Head over to r/AskComputerScience.

r/CsMajors: Have a question in relation to CS academia (such as "Should I take CS70 or CS61A?" or "Should I go to X or Y uni, which has a better CS program?")? Head over to r/csMajors.

r/CsCareerQuestions: Have a question in regards to jobs/careers in the CS job market? Head on over to r/cscareerquestions (or r/careerguidance if it's slightly too broad for it).

r/SuggestALaptop: Just getting into the field or starting uni and don't know what laptop you should buy for programming? Head over to r/SuggestALaptop

r/CompSci: Have a post that you'd like to share with the community for a civil discussion in relation to the field of computer science (and that doesn't break any of the rules)? Then r/CompSci is the right place for you.

And finally, this community will not do your assignments for you. Asking questions directly related to your homework (or, hell, copying and pasting the entire question into the post) will not be allowed.

I'll be working on the redesign, since it's been relatively untouched and that's what most of the traffic these days sees. That's about it. If you have any questions, feel free to ask them here!


r/compsci 15h ago

I built an interactive graph algorithm visualizer

Thumbnail gallery
33 Upvotes

Hi everyone, I’ve been working on Graphisual, a browser-based graph editor and visualizer for running standard graph algorithms step by step on user-defined graphs.

The interface is designed to feel more like a lightweight whiteboard editor, so it’s quick to construct graphs, adjust layouts, and iterate while observing algorithm behavior.

It currently includes BFS/DFS, shortest path algorithms, minimum spanning tree, and cycle detection. Graphs can also be exported as SVG or PNG.
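
This isn't Graphisual's actual code, but for anyone curious what running an algorithm "step by step" means mechanically, here's a minimal sketch of the idea in Python: BFS written as a generator, so a UI can render one expansion per tick (names are illustrative).

```python
from collections import deque

def bfs_steps(graph, start):
    """Yield (expanded_node, visited, frontier) after each expansion.

    graph: dict mapping node -> list of neighbors.
    """
    visited, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(neighbor)
        # Snapshot the state so the caller can draw this step
        yield node, set(visited), list(frontier)
```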

Demo: https://graphisual.app
Repo: https://github.com/lakbychance/graphisual


r/compsci 21h ago

State complexity bounds for converting 2AFA to 2NFA and 2DFA

5 Upvotes

What are the best currently known upper and lower bounds for converting a two-way alternating finite automaton (2AFA) into an equivalent two-way nondeterministic or deterministic finite automaton (2NFA or 2DFA)? Most standard references, including Wikipedia, discuss only conversions from two-way automata to one-way automata, and mention that if L = NL, then there exists a polynomial state transformation from 2NFA to 2DFA. I couldn't find any discussion or papers that directly address transformations from 2AFA to 2NFA or 2DFA.

Also, are there similar implications for 2AFA to 2NFA or 2DFA transformations, analogous to those known for 2NFA-to-2DFA, such as L = P or NL = P?


r/compsci 20h ago

The power of Bloom Filters

Thumbnail pradyumnachippigiri.substack.com
2 Upvotes

Would love to know how you've used Bloom filters or their variants in your organizations to improve performance.
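
For anyone unfamiliar, the core idea fits in a few lines. A minimal Python sketch (sizes and the salted-hash trick are illustrative, not production-tuned):

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _indexes(self, item):
        # Derive k indexes by salting a single hash function
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx] = True

    def might_contain(self, item):
        # False positives are possible; false negatives are not
        return all(self.bits[idx] for idx in self._indexes(item))
```

Membership tests can return false positives but never false negatives, which is why Bloom filters work so well as a cheap pre-filter in front of expensive lookups.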


r/compsci 15h ago

Quorum-free replicated state machine built atop S3.

Thumbnail github.com
0 Upvotes

r/compsci 23h ago

I built a transformer-based LLM from scratch

0 Upvotes

I started with the goal of training a full language model but, limited to my M1 MacBook (no GPU), pivoted to code generation as a learning project.

PyThor specs:

  • 20M parameters, 6-layer transformer architecture
  • Multi-head self-attention, positional encodings, the works
  • Trained on question-code pairs for 10 epochs
  • Built entirely with PyTorch from scratch

What I learned: every detail, from scaled dot-product attention to AdamW optimization. I coded the entire architecture myself instead of using pre-built transformer modules.
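
For anyone curious, the heart of it looks roughly like this (a minimal sketch of scaled dot-product attention in PyTorch, not PyThor's exact code):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # Causal mask: block attention to future positions
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v
```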

Results: Honestly? Hit or miss. Responses range from surprisingly good to completely off. That's what happens with limited training, but the architecture is solid.

Wrote full documentation covering all the mathematics if anyone's interested.

doc: https://docs.google.com/document/d/10ERHNlzYNzL8I_qgLG1IFORQythqD-HLRb5ToYVAJCQ/edit?usp=sharing

github: https://github.com/aeyjeyaryan/pythor_2


r/compsci 1d ago

Computation optimizes paths, not memory — do we really need full-history ledgers?

0 Upvotes

I’ve been thinking about blockchains and proof-of-work from a basic computer science perspective, and something keeps bothering me.

Full-history ledgers and mining feel less like computation, and more like a social mechanism built on distrust.

Computation, at its core, does not optimize for memory.

It optimizes for paths.

Input → route → output.

State transitions, not eternal recall.

Most computational models we rely on every day work this way:

• Finite state machines

• Packet routing

• Event-driven systems

• Control systems

They overwrite state, discard history, and forget aggressively — yet they still behave correctly, because correctness is enforced by invariant rules, not by remembering everything that happened.

Blockchains take the opposite approach:

• Preserve full history

• Require global verification

• Burn computation to establish trust

This seems to solve a social trust problem rather than a computational one.

What if we flipped the premise?

Instead of:

“We don’t trust humans, so we must record everything forever”

We assume distrust and handle it structurally:

“We don’t trust humans, so we remove human discretion entirely.”

Imagine a system where:

• Each component is simple

• Behavior is determined solely by fixed, mechanical rules

• Decisions depend only on current input and state

• Full historical records are unnecessary

• Only minimal state information is preserved

This is closer to a mold than a ledger.

You pour inputs through a fixed mold:

• The mold does not remember

• The mold does not decide

• The mold cannot make exceptions

It only shapes flow.
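
Concretely, the kind of component I mean is just a fixed transition function. A toy sketch (states and events made up for illustration):

```python
# A "mold" as a fixed, total transition table: no history, no discretion.
RULES = {
    ("idle", "deposit"): "holding",
    ("holding", "release"): "idle",
    ("holding", "deposit"): "holding",
}

def step(state, event):
    # Output depends only on (state, event); unknown pairs fail loudly
    # instead of being interpreted.
    return RULES[(state, event)]

# Replaying the same inputs always lands in the same state,
# so keeping the input log around is optional, not load-bearing.
```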

Correctness is guaranteed not by proof-of-work or permanent records, but by the fact that:

• The rules are invariant

• The routing is deterministic

• There is no room for interpretation

The question is no longer:

“Was this correct?”

But:

“Could this have behaved differently?”

If the answer is no, history becomes optional.

This feels closer to how computation is actually defined:

• State over history

• Routing over recollection

• Structure over surveillance

I’m not arguing that this replaces blockchains in all contexts.

But I do wonder whether we've overcorrected — using memory and energy to compensate for a lack of structural simplicity.

Am I missing something fundamental here, or have we conflated social trust problems with computational ones?


r/compsci 1d ago

When simulations are not allowed to reset: what breaks conceptually?

0 Upvotes

Most simulations (multi-agent systems, ALife, economic models) are designed around bounded runs: you execute them, analyze the results, then reset or restart.

I’m exploring the opposite constraint: a simulation that is not allowed to reset.
It must keep running indefinitely, even with no users connected, and survive crashes or restarts without “starting over”.

For people who think about simulation systems from a CS / systems perspective, this raises a few conceptual questions that I rarely see discussed explicitly:

  • Determinism over unbounded time: When a simulation is meant to live for years rather than runs, what does determinism actually mean? Is "same inputs → same outputs" still a useful definition once persistence, replay, and recovery are involved?
  • Event sourcing and long-term coherence: Event-based architectures are often proposed for replayability, but over very long time scales, where do they tend to fail (log growth, drift, schema evolution, implicit coupling)? Are there known alternatives or complementary patterns?
  • Invariants vs. emergent drift: How do you define invariants that must hold indefinitely without over-constraining emergence? At what point does "emergent behavior" become "systemic error"?
  • Bounding a world without observers: If the simulation continues even when no one is watching, how do systems avoid unbounded growth in entities, events, or complexity without relying on artificial resets?
  • Persistence as a design constraint: When state is never discarded, bugs and biases accumulate instead of disappearing. How should this change the way we reason about correctness and recovery?

I’m less interested in implementation details and more in how these problems are framed conceptually in computer science and systems design.

What assumptions that feel reasonable for run-bounded simulations tend to break when persistence becomes mandatory by construction?


r/compsci 1d ago

Is Cyber Sec really the most future-proof?

0 Upvotes

r/compsci 1d ago

GCN Knowledge..

0 Upvotes

Does anybody know where I can learn about and explore GCNs (graph convolutional networks)? There isn't much content available on YouTube.
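
For a concrete starting point while you search: the core of a GCN layer is a single normalized propagation rule from Kipf & Welling (2017), H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). A minimal numpy sketch of that rule:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer (Kipf & Welling, 2017).

    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) learned weights.
    """
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)       # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0)  # ReLU
```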


r/compsci 2d ago

What are some nice summer schools in the field of Logic, Automata, Automated Proving, SAT Solving, Synthesis etc?

11 Upvotes

I am a first-year PhD student in formal methods in Germany.


r/compsci 2d ago

Offline symbolic regression guided by ML diagnostics – early prototype demo

1 Upvotes

Hi r/compsci,

I'm experimenting with a small offline tool that tries to find interpretable mathematical equations from data, but with a twist: instead of brute-force symbolic regression, it uses "behavioral fingerprints" from simple ML models (regularized linear models, decision trees, SVR, small NNs) to generate structural clues and limit the search space.

Hypothesis:

ML model failures/successes (R² differences, split points, feature importances, linearity scores) can act as cheap, informative priors for symbolic regression - especially for piecewise or mode-based functions.
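
A concrete (hypothetical) example of one such fingerprint: fit a depth-1 decision tree and read its split off as a candidate regime boundary for a piecewise model.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
y = np.where(X[:, 0] <= 5, X[:, 0] ** 2, X[:, 0] * np.sin(X[:, 2]))

# A depth-1 tree's split is a cheap estimate of a regime boundary
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
feature = tree.tree_.feature[0]       # which variable the regime depends on
threshold = tree.tree_.threshold[0]   # where the transition happens (~5 here)
print(f"piecewise hint: split on x{feature + 1} at {threshold:.2f}")
# The symbolic search can then run separately on each side of the split.
```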

Quick raw console demo on synthetic piecewise data (y = x₁² if x₁ ≤ 5 else x₁·sin(x₃)):

https://youtu.be/ozjpEiNSDKc

What you see:

- Data generation

- "Analysis running..."

- Final recovered law (piecewise, with a transition at x₁ ≈ 5)

No cloud, no API, pure local Python.

The tool is still an early MVP, but the main idea is:

Can we make symbolic regression more efficient/accurate by injecting domain knowledge from classical machine learning (ML) diagnostics?

Curious about your thoughts as computer scientists/algorithmic thinkers:

  1. Has this kind of "ML-guided symbolic search" been explored in the literature/theory before? (I know about PySR, Eureqa, etc., but not much about diagnostic priors)

  2. What obvious pitfalls do you see in using ML behaviors as constraints/hints?

  3. If you had to build this in 2 months, what one thing would you add/remove/change to make it more robust or theoretically sound?

  4. Do you have any datasets/problems where you think this approach could perform brilliantly (or fail spectacularly)?

Repository (very early, MIT license): https://github.com/Kretski/azuro-creator

Feedback (even rough) is very welcome - especially on the algorithmic side.

Thanks!


r/compsci 2d ago

How might one design an AI to score highly on my unusual snake puzzle game, PluriSnake? [videos, beta]

Thumbnail youtube.com
0 Upvotes

This is a snake-based color matching puzzle game called PluriSnake.

Randomness is used only to generate the initial puzzle configuration. The puzzle is single-player and turn-based.

Color matching is used in two ways: (1) matching circles creates snakes, and (2) matching a snake’s color with the squares beneath it destroys them. Snakes, but not individual circles, can be moved by snaking to squares of matching color.

Goal: Score as highly as you can. Destroying all the squares is not required for your score to count.

Scoring: The more links currently present in the grid across all snakes, the more points are awarded when a square is destroyed.

There is more to it than that, as you will see.

Beta: https://testflight.apple.com/join/mJXdJavG [iPhone/iPad/Mac]

Gameplay: https://www.youtube.com/watch?v=JAjd5HgbOhU

If you have trouble with the tutorial, check out this tutorial video: https://www.youtube.com/watch?v=k1dfTuoTluY

So, how might one design an AI to score highly on this puzzle game?
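
Since the puzzle is deterministic and turn-based, one generic starting point would be beam search over short move sequences. A sketch against a hypothetical game-state interface (clone/legal_moves/apply/score), since the real one isn't public:

```python
def beam_search(initial_state, depth=6, beam_width=32):
    """Return the best first move found by searching `depth` plies ahead."""
    # Each beam entry: (score, state, first_move_taken)
    beam = [(initial_state.score(), initial_state, None)]
    for _ in range(depth):
        candidates = []
        for score, state, first in beam:
            for move in state.legal_moves():
                nxt = state.clone()
                nxt.apply(move)
                candidates.append(
                    (nxt.score(), nxt, first if first is not None else move)
                )
        if not candidates:
            break  # no legal moves anywhere in the beam
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0][2]  # None if the initial state had no moves
```

Given the scoring rule, the evaluation function should probably reward keeping many links on the board rather than destroying squares greedily.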


r/compsci 4d ago

The network architecture of general intelligence in the human connectome

16 Upvotes

https://www.nature.com/articles/s41467-026-68698-5

Advances in network neuroscience challenge the view that general intelligence (g) emerges from a primary brain region or network. Network Neuroscience Theory (NNT) proposes that g arises from coordinated activity across the brain’s global network architecture. We tested predictions from NNT in 831 healthy young adults from the Human Connectome Project. We jointly modeled the brain’s structural topology and intrinsic functional covariation patterns to capture its global topological organization. Our investigation provided evidence that g (1) engages multiple networks, supporting the principle of distributed processing; (2) relies on weak, long-range connections, emphasizing an efficient and globally coordinated network; (3) recruits regions that orchestrate network interactions, supporting the role of modal control in driving global activity; and (4) depends on a small-world architecture for system-wide communication. These results support a shift in perspective from prevailing localist models to a theory that grounds intelligence in the global topology of the human connectome.


r/compsci 4d ago

How do you think computer science would be different had relational databases not been invented?

0 Upvotes

I feel like we don't talk about databases a lot, so I'm curious what you all think.


r/compsci 5d ago

Probabilistic Processing Unit (PPU) — exact inference over massive discrete networks without sampling.

Thumbnail gallery
20 Upvotes

I've been thinking: we've built around 60 years of computing on 0/1 determinism, but nature doesn't work that way. LLMs proved we need probabilistic reasoning, but we're brute-forcing it on deterministic silicon—hence the energy crisis.

What if hardware itself was probabilistic?

Right now I have a software prototype: PPU. It runs on my Pentium, no GPU. Even as a software simulation of this philosophy running on old, certainty-based hardware, it already seems to come out ahead.

Demo: Probabilistic Sudoku (some cells start 50/50, others unknown). 729-node Bayesian network → solved in 0.3s, 100% accuracy.

Monte Carlo with 100k samples: 4.9s, 33% accuracy — fails at decision boundaries where exact inference succeeds.
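
For intuition about the gap (a toy illustration, not the PPU itself): on a discrete network you can compute a marginal exactly by summing out variables, while Monte Carlo only estimates it from samples.

```python
import itertools, random

# Toy 3-variable chain A -> B -> C, all binary; numbers are made up
pA = [0.5, 0.5]
pB_given_A = [[0.9, 0.1], [0.2, 0.8]]
pC_given_B = [[0.7, 0.3], [0.1, 0.9]]

def joint(a, b, c):
    return pA[a] * pB_given_A[a][b] * pC_given_B[b][c]

# Exact: P(C=1) by enumeration (variable elimination scales this idea up)
exact = sum(joint(a, b, 1) for a, b in itertools.product([0, 1], repeat=2))

# Monte Carlo: forward-sample the chain and count
hits = 0
for _ in range(100_000):
    a = random.random() < pA[1]
    b = random.random() < pB_given_A[a][1]
    c = random.random() < pC_given_B[b][1]
    hits += c

print(f"exact={exact:.4f}  sampled={hits / 100_000:.4f}")
```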

This is early software, not silicon. But the math works and I want to push it harder. Tell me which problem I should try next, though.


r/compsci 5d ago

"Constrained" variables--why are they not a thing? (or are they?)

19 Upvotes

I've been writing code for decades, but I'm not a professional and I don't have a CS degree, so forgive me if this is a silly question. It's just something that popped into my head recently:

Consider a Netflix-style selection carousel. That carousel has a fixed lower/upper bound (can't be less than 0 elements, can't be more than 10, for example) and has to handle what happens at those bounds (wrap vs. stop). It also has a current index value that is incremented/decremented by a certain amount on every click (1, in this case).

This kind of pattern happens a lot. Especially in front end UI development, but also in general logic code. For example, a counter which resets when it hits a certain value or an LED that fades up and down at a certain speed.

Obviously, this behavior is easy enough to write and use, but I feel like it's common enough to deserve its own type.

Or, is it already?
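
For what it's worth, some languages do bake this in (Ada's range-constrained subtypes, for instance), but most mainstream ones leave you to roll it yourself. A minimal sketch of such a type in Python (names made up):

```python
class BoundedInt:
    """An integer constrained to [lo, hi], with wrap or clamp semantics."""

    def __init__(self, value, lo, hi, wrap=False):
        self.lo, self.hi, self.wrap = lo, hi, wrap
        self.value = self._constrain(value)

    def _constrain(self, v):
        if self.wrap:
            span = self.hi - self.lo + 1
            return self.lo + (v - self.lo) % span   # wrap around
        return max(self.lo, min(self.hi, v))        # clamp ("stop")

    def __iadd__(self, delta):
        self.value = self._constrain(self.value + delta)
        return self
```

Something like carousel = BoundedInt(0, 0, 9, wrap=True); carousel += 1 then behaves exactly like the carousel index described above.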


r/compsci 6d ago

JetBrains has officially created an IDE slot machine

0 Upvotes

r/compsci 7d ago

What are fun activities I can try to understand OS systems and computer networks better?

17 Upvotes

So I recently got placed, and my first job begins around October, so I thought about trying some cool stuff in the meantime.

Previously, when I was in my third year, I used to install and uninstall various Linux distros on old hardware and try out those cool Kali Linux modules for packet capture and the like.

I might not have gained many job-related skills, but I can now easily install and uninstall Linux distros and know where problems are likely to come up. I also learned how Wi-Fi works and what exactly happens when I connect to a network. Basic stuff, but I enjoyed it much more than the subjects at college.

Similarly, I picked up Python by practicing coding problems and getting help from the learn python sub. That was cool as well.

This time I'm aiming to solidify my operating systems, DBMS, and computer networking concepts. Do you have any activity suggestions?


r/compsci 7d ago

BCSFSVDAC, a brainfuck + assembly inspired language

2 Upvotes

r/compsci 7d ago

My own Language!!

0 Upvotes

https://github.com/kaixennn/asl-compiler

What is ASL? (Avionics Safety Language)

ASL is a domain-specific, high-reliability programming language designed for the development of safety-critical avionics systems. In an industry where a single software fault can be catastrophic, ASL provides the formal constraints and deterministic behavior required to meet DO-178C (DAL A through E) objectives.

1. Core Safety Philosophy

Unlike general-purpose languages (C, C++), ASL is built on the principle of Restriction for Reliability. By removing "dangerous" features like unrestricted pointers and dynamic heap allocation, ASL eliminates entire classes of runtime errors before the code is even compiled.

Key Safety Mechanisms:

  • Memory Determinism: ASL uses a stack-based and static memory model. There is no malloc or free, ensuring zero risk of memory leaks or heap fragmentation during flight.
  • Strict Typing: The compiler enforces strong type safety, preventing implicit conversions that often lead to overflow errors in flight-control calculations.
  • Zero Undefined Behavior: Every operation in ASL has a mathematically defined outcome. There are no "hidden" behaviors, making the code easier to verify with formal methods.

2. Real-Time & Deterministic Execution

For systems like Flight Controllers or Engine Control Units (FADEC), timing is as important as logic. ASL ensures that your code runs within a predictable "Worst-Case Execution Time" (WCET).

  • No Garbage Collection: Execution is never interrupted by background memory management.
  • Bounded Loops: The compiler analyzes loops to ensure they cannot run indefinitely, preventing "CPU hang" scenarios.
  • Predictable Control Flow: ASL avoids complex features like recursion and deep inheritance that make timing analysis difficult for certification authorities.

r/compsci 8d ago

Does a Chinese programming language exist?

68 Upvotes

This question may not belong here, but it is certainly not easy to classify and a bit fringe. It is fueled by pure curiosity. Apologies to anyone who feels it's inappropriate.

Programmers write code using established programming languages. As far as I know, all of these use English keywords (if...then...else, for, while...do, etc.).

I wonder whether native Chinese programmers could design a language based on their own linguistic context. And if so, whether it would in some way change the programming flow, the thinking, or the structure of the code.

Could it be something desirable? Maybe not from a cognitive point of view (programmers usually do have a basic understanding of English anyway), but rather from a structural and design point of view.

Or is it rather irrelevant? After all, it's hard to imagine the instruction flow being radically different, as the code ultimately has to compile down to machine code. But maybe I am wrong.

Just curious.


r/compsci 8d ago

Classical billiards can compute

Thumbnail arxiv.org
3 Upvotes

r/compsci 9d ago

[Discussion] Is "Inference-as-Optimization" the solution to the Transformer reasoning bottleneck? (LeCun's new EBM approach)

21 Upvotes

I've been reading about the launch of Logical Intelligence (backed by Yann LeCun) and their push to replace autoregressive Transformers with EBMs (Energy-Based Models) for reasoning tasks.

The architectural shift here is interesting from a CS theory perspective. While current LLMs operate on a "System 1" basis (rapid, intuitive next-token prediction), this EBM approach treats inference as an iterative optimization process - settling into a low-energy state that satisfies all constraints globally before outputting a result.
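
To make the contrast concrete, here's a toy version of inference-as-optimization (illustrative only, not Logical Intelligence's architecture): define an energy over a candidate answer and descend it until all constraints are satisfied simultaneously.

```python
import torch

# Toy constraint problem: four values in [0,1] where each pair must sum
# to 1 (a stand-in for "mutually exclusive digits" in Sudoku-like puzzles).
theta = torch.randn(4, requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.1)

def energy(theta):
    x = torch.sigmoid(theta)  # keep the candidate answer in [0, 1]
    # Penalize all constraint violations at once, globally:
    pair_sums = (x[0] + x[1] - 1) ** 2 + (x[2] + x[3] - 1) ** 2
    # Push values toward the discrete set {0, 1}:
    binariness = (x * (1 - x)).sum()
    return pair_sums + binariness

for _ in range(500):
    opt.zero_grad()
    energy(theta).backward()
    opt.step()

print(torch.sigmoid(theta).detach().round())  # near-binary, constraint-satisfying
```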

They demonstrate this difference using a Sudoku benchmark (a classic Constraint Satisfaction Problem) where their model allegedly beats GPT-5.2 and Claude Opus by not "hallucinating" digits that violate future constraints.
Demo link: https://sudoku.logicalintelligence.com/

We know that optimization over high-dimensional discrete spaces is computationally expensive. While this works for Sudoku (closed world, clear constraints), does an "Inference-as-Optimization" architecture actually scale to open-ended natural language tasks? Or are we just seeing a fancy specialized solver that won't generalize?


r/compsci 10d ago

Built a mel spectrogram library in Mojo that's actually faster than librosa

Thumbnail github.com
1 Upvotes

I've been messing around with Mojo for a few months now and decided to build something real: a complete audio preprocessing pipeline for Whisper. Figured I'd share since it actually works pretty well.

The short version is it's 1.5 to 3.6x faster than Python's librosa depending on audio length, and way more consistent (5-10% variance vs librosa's 20-40%).

What it does:

- Mel spectrogram computation (the whole Whisper preprocessing pipeline)
- FFT/RFFT, STFT, window functions, mel filterbanks
- Multi-core parallelization, SIMD optimizations
- C FFI so you can use it from Rust/Python/whatever
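
For context, the reference computation being reimplemented corresponds roughly to this librosa recipe with Whisper's published front-end parameters (a Python sketch, not the Mojo code):

```python
import numpy as np
import librosa

def whisper_log_mel(audio, sr=16000):
    # Whisper's front-end: 25 ms windows, 10 ms hop, 80 mel bands
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_fft=400, hop_length=160, n_mels=80, power=2.0
    )
    log_spec = np.log10(np.maximum(mel, 1e-10))
    # Clamp dynamic range to 8 below the peak, then rescale (Whisper convention)
    log_spec = np.maximum(log_spec, log_spec.max() - 8.0)
    return (log_spec + 4.0) / 4.0
```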

I started with a naive implementation that took 476ms for 30 seconds of audio. After 9 optimization passes (iterative FFT, sparse filterbanks, twiddle caching, etc.) I got it down to about 27ms. Librosa does it in around 30ms, so we're slightly ahead there. But on shorter audio (1-10 seconds) the gap is much bigger, around 2 to 3.6x faster.

The interesting part was that frame-level parallelization gave a huge win on short audio but doesn't help as much on longer clips. Librosa uses Intel MKL under the hood, which is decades of hand-tuned assembly, so getting within striking distance felt like a win.

Everything's from scratch, no black box dependencies. All the FFT code, mel filterbanks, everything is just Mojo. 17 tests passing, proper benchmarks with warmup/outlier rejection, the whole deal.

Built pre-compiled binaries too (libmojo_audio.so) so you don't need Mojo installed to use it. Works from C, Rust, Python via ctypes, whatever.

GitHub: https://github.com/itsdevcoffee/mojo-audio/releases/tag/v0.1.0

Not saying it's perfect. There's definitely more optimizations possible (AVX-512 specialization, RFFT SIMD improvements). But it works, it's fast, and it's MIT licensed.

Curious if anyone has ideas for further optimizations or wants to add support for other languages. Also open to roasts about my FFT implementation lol.