Got it. Here’s the “project-level” design (still high-level, but with your technical choices and the AI piece called out clearly).
Project goals with your technical choices baked in
Guiding constraints
• Django + templates + HTMX as the primary UI model (no SPA).
• CodeMirror as an “editor island” for Markdown.
• Markdown is the internal format everywhere (notes, chunks, explanations, prompts) to keep import/export trivial.
• PostgreSQL from day 1 because search is a core feature, not a nice-to-have.
• Background work must exist early (OCR/transcripts/indexing/summarization).
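As a minimal sketch of what the HTMX-first UI model means in practice (full page on normal requests, fragment-only response on HTMX requests), assuming hypothetical view and template names:

```python
# Hypothetical search view: normal requests get the full page (including the
# CodeMirror editor island), HTMX requests get only the results fragment.
from django.shortcuts import render

def search_page(request):
    query = request.GET.get("q", "")
    results = []  # later: the PG FTS query over ingested source segments

    # HTMX sets the HX-Request header, so one view can serve both cases.
    template = (
        "search/_results.html" if request.headers.get("HX-Request")
        else "search/index.html"
    )
    return render(request, template, {"query": query, "results": results})
```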
⸻
The four main product areas
1. Course content hub
Goal: capture all course material in one place and make it searchable + referenceable.
What “done” looks like:
• You can attach/organize:
  • slides/PDFs, målbeskrivning (course objectives), instudieringsfrågor (study questions)
  • lecture resources + your own lecture notes
  • YouTube links (+ subtitles/transcripts)
• Everything becomes text (OCR/transcripts) and is searchable via PG FTS.
• Every processed/derived item points back to its origin (page number, timestamp, file).
Key UX goal: “I remember something was mentioned somewhere” → search finds it fast → open it at the exact page/timestamp.
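A sketch of how “everything becomes text and points back to origin” could look with Django’s PostgreSQL full-text search; the model names (Source, SourceSegment) and fields are assumptions, not decisions:

```python
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVectorField
from django.db import models


class Source(models.Model):
    title = models.CharField(max_length=255)
    kind = models.CharField(max_length=32)    # "pdf", "slides", "youtube", ...
    url_or_path = models.TextField(blank=True)


class SourceSegment(models.Model):
    source = models.ForeignKey(Source, on_delete=models.CASCADE, related_name="segments")
    text_md = models.TextField()                   # OCR/transcript output, stored as Markdown
    page_number = models.IntegerField(null=True)   # provenance for PDFs/slides
    timestamp_s = models.IntegerField(null=True)   # provenance for video transcripts
    search_vector = SearchVectorField(null=True)   # kept up to date by a background job

    class Meta:
        indexes = [GinIndex(fields=["search_vector"])]


def search_segments(query_text: str):
    """Ranked FTS over all ingested material; every hit carries its origin fields."""
    query = SearchQuery(query_text)
    return (
        SourceSegment.objects
        .filter(search_vector=query)
        .annotate(rank=SearchRank(models.F("search_vector"), query))
        .order_by("-rank")[:20]
    )
```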
⸻
2. Incremental writing (SuperMemo-style) on top of Markdown
Goal: turn raw material into durable knowledge in small steps.
What “done” looks like:
• You can extract passages from sources into your Markdown note system as stable chunks.
• Each chunk has a lifecycle: extracted → rewritten → distilled → prompted → maintained.
• When you fail a review or miss a tentamen question, the system routes you back to the exact chunk/source context to improve the text/prompt (the “SM loop”).
Key UX goal: You never rewrite the whole note; you process a queue of small “next actions”.
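A sketch of the chunk lifecycle as an explicit field on the chunk model (reusing the hypothetical SourceSegment from the earlier sketch for provenance; the helper is illustrative):

```python
from django.db import models


class Chunk(models.Model):
    class Stage(models.TextChoices):
        EXTRACTED = "extracted"
        REWRITTEN = "rewritten"
        DISTILLED = "distilled"
        PROMPTED = "prompted"
        MAINTAINED = "maintained"

    segment = models.ForeignKey("SourceSegment", null=True, on_delete=models.SET_NULL)  # provenance
    body_md = models.TextField()   # the chunk itself is always Markdown
    stage = models.CharField(max_length=16, choices=Stage.choices, default=Stage.EXTRACTED)
    updated_at = models.DateTimeField(auto_now=True)

    def next_stage(self):
        """The next small step the incremental queue should ask for."""
        order = list(self.Stage)
        i = order.index(self.Stage(self.stage))
        return order[min(i + 1, len(order) - 1)]
```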
⸻
3. Old exams (tentamen) training
Goal: train the exact skill that matters: solving the course’s exam-style questions.
What “done” looks like:
• Import exam material (from DISA exports, PDFs, scans).
• Run exam sessions (timed/untimed) and log: correctness, time spent, confidence, common wrong option, etc.
• Link questions to:
  • concepts/chunks
  • source locations (slides/pages) that explain them
Key UX goal: After an attempt, you get a short list of specific fixes: which concept/chunk to improve, which sources to revisit.
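A sketch of the attempt log plus the question-to-chunk/source links a remediation list would be built from; model and field names here are illustrative:

```python
from django.db import models


class ExamQuestion(models.Model):
    exam_label = models.CharField(max_length=64)    # e.g. one tentamen sitting
    number = models.CharField(max_length=16)
    body_md = models.TextField()
    chunks = models.ManyToManyField("Chunk", blank=True)            # concepts that explain it
    segments = models.ManyToManyField("SourceSegment", blank=True)  # slides/pages to revisit


class QuestionAttempt(models.Model):
    question = models.ForeignKey(ExamQuestion, on_delete=models.CASCADE, related_name="attempts")
    session_id = models.CharField(max_length=64)    # groups one timed/untimed sitting
    correct = models.BooleanField()
    chosen_option = models.CharField(max_length=8, blank=True)  # to surface common wrong options
    confidence = models.IntegerField(null=True)     # e.g. 1-5 self-rating
    time_spent_s = models.IntegerField(null=True)
    created_at = models.DateTimeField(auto_now_add=True)


def remediation_targets(session_id: str):
    """Questions missed in one sitting, with the chunks/sources to revisit prefetched."""
    return (
        ExamQuestion.objects
        .filter(attempts__session_id=session_id, attempts__correct=False)
        .distinct()
        .prefetch_related("chunks", "segments")
    )
```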
⸻
4. Spaced repetition (FSRS) across both concepts and exams
Goal: one review system for everything that’s worth remembering.
What “done” looks like:
• FSRS schedules:
  • knowledge prompts derived from chunks (conceptual memory)
  • exam questions (performance memory)
• A unified daily queue mixes:
  • concept prompts (depth)
  • exam drilling (transfer / exam performance)
Key UX goal: Your daily queue reflects both “what you need to understand” and “what you need to score”.
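One way to get a single queue is a shared review row that points at either a concept prompt or an exam question and carries the FSRS state; the actual FSRS update would sit behind it (e.g. via the py-fsrs package). All names below are assumptions:

```python
from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models
from django.utils import timezone


class ReviewItem(models.Model):
    # Points at either a knowledge prompt (derived from a chunk) or an exam question.
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    target = GenericForeignKey("content_type", "object_id")

    stability = models.FloatField(default=0.0)   # FSRS memory state
    difficulty = models.FloatField(default=0.0)
    due = models.DateTimeField(default=timezone.now)


def daily_queue(limit: int = 50):
    """Everything due now: concept prompts and exam questions mixed in one list."""
    return ReviewItem.objects.filter(due__lte=timezone.now()).order_by("due")[:limit]
```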
⸻
AI integration goals (the missing piece)
You want AI as an operator inside the workflows, not a separate chatbot page.
AI should be able to do (first-class features):
1. OCR improvement / cleanup
   • de-noise OCR text; fix hyphenation, headings, and tables “good enough”
2. Summarization at multiple granularities
   • per page/segment, per lecture, per section
   • generate: key points, misconceptions, “exam traps”, candidate prompts
3. Teacher-substitute Q&A grounded in your sources
   • ask a question and get an answer that cites your sources (page/timestamp)
4. Validation of free-text answers
   • you type an answer to an exam question / prompt
   • AI grades it against a rubric + your sources and suggests what’s missing
Non-negotiables for AI in this system:
• Provenance: AI answers must cite the exact source chunks they used (your extracted text units).
• Separation of concerns: AI generates suggestions; your stored knowledge remains Markdown artifacts you control.
• Repeatability: store the prompt/version/metadata for generated outputs so you can regenerate or compare them later.
AI architecture choice (still high-level):
• A provider-agnostic “AI service layer” with capabilities:
  • summarize(source_segment)
  • extract_concepts(source_segment)
  • generate_prompts(chunk)
  • answer_question(query, retrieved_context)
  • grade_free_text(answer, rubric, context)
• Retrieval should be PG-driven (FTS + ranking), and the AI is fed only the top-k source segments (RAG).
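A sketch of that service layer as a Python Protocol, so providers stay swappable; the Grounded return shape (text + segment ids + prompt version) mirrors the provenance/repeatability rules above and is an assumption, not a fixed contract:

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Grounded:
    text_md: str             # generated output, always Markdown
    segment_ids: list[int]   # provenance: which source segments were used
    prompt_version: str      # repeatability: regenerate or compare later


class AIService(Protocol):
    def summarize(self, source_segment: str) -> Grounded: ...
    def extract_concepts(self, source_segment: str) -> Grounded: ...
    def generate_prompts(self, chunk: str) -> Grounded: ...
    def answer_question(self, query: str, retrieved_context: Sequence[str]) -> Grounded: ...
    def grade_free_text(self, answer: str, rubric: str, context: Sequence[str]) -> Grounded: ...
```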
⸻
Background tasks: important correction about Django 6 tasks
Django 6 adds a built-in Tasks framework (definition/queueing/results), but it does not ship a production-ready worker/backend that executes tasks. 
So your project goal should be:
• use Django’s Tasks API as the standard interface
• pick an execution mechanism:
  • simplest: the django-tasks reference implementation (community package)
  • or keep Celery/RQ if you already like them
Either way, the system must run these jobs out-of-request: OCR, transcript fetching, indexing, summarization, embedding generation (if you add it), and exam parsing.
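A minimal sketch of how this could wire together, assuming the Django 6 Tasks API (`@task` + `enqueue()`) with the community django-tasks database backend as the execution mechanism; treat the exact backend path and worker command as assumptions to check against that package’s docs:

```python
# settings.py — route the default task backend to the reference implementation
TASKS = {
    "default": {
        "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    }
}

# tasks.py — defined against Django's Tasks API, so the backend stays swappable
from django.tasks import task

@task()
def process_source(source_id: int) -> None:
    """Out-of-request pipeline: OCR/transcript fetch, then segment indexing."""
    ...

# e.g. in the upload view:
#   result = process_source.enqueue(source.id)
# A separate worker process (`./manage.py db_worker` with django-tasks)
# picks the job up and executes it.
```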
⸻
Success criteria (project-level)
• You can ingest a course’s full material set in a day and search it effectively.
• Your incremental queue (“next chunks to refine / prompt”) always has the right next step.
• Exam attempts automatically generate a focused remediation plan.
• Your FSRS daily queue stays small but high-yield.
• AI outputs are grounded and traceable (no “hand-wavy teacher”).
⸻
If you want the next doc to be even more “project charter” style, I can produce:
• a phased roadmap (MVP → V1 → V2)
• success metrics per phase (e.g., “time-to-find”, “review load”, “exam delta”)
• a list of “risky assumptions” (OCR quality, exam-parsing variability, availability of answer keys, etc.)