Category: Insights EN

  • 6 Months of Vibe Coding — What Broke and What I’d Never Do Again

    April 2024. I’ve just pushed the first working version of naswoim.org to production. The app runs. Users can log in. Data persists in Supabase. On paper, it’s a success.

    Under the hood: two competing design systems, seventeen duplicate utility functions scattered across twelve files, and a Supabase RLS policy that silently fails for one specific edge case I won’t discover for three more weeks.

    Vibe coding works. But it fails in ways that are invisible — until they aren’t.

    Code on screen with neon lighting
    Photo: Jakub Żerdzicki / Unsplash

    Quick context: what I built

    Over roughly twelve months, I built three applications using vibe coding — almost without external developers:

    • naswoim.org — a platform for property investors: checklists, budgets, documents, expert marketplace, land maps, AI assistant. Web + Android + iOS + admin portal.
    • industrverse.com — B2B SaaS for industrial VR training: 7 user roles, 9 dashboards per role, real-time communication, VR session gateway, full backend API.
    • marcinpaszkiewicz.com — this site. Astro SSR + WordPress headless. Simpler, but instructive.

    I used Claude Code as my primary tool, with occasional help from Cursor. I wrote somewhere between 40,000 and 60,000 lines of production code this way. I shipped everything. It all works.

    Here’s what I got wrong.

    Mistake #1: I let AI pick the architecture

    When I started naswoim.org, I described the project to Claude and asked what stack it recommended. It gave me a solid answer: React 19, Vite, Supabase, Tailwind CSS 4. All excellent choices.

    Then I asked about UI components. It suggested MUI 7. I said yes — it was fast, it had everything I needed.

    The problem: I was already using Tailwind CSS 4. Now I had two design systems:

    Diagram: design system conflict

    Tailwind CSS 4
    ▸ utility-first
    ▸ spacing: rem scale
    ▸ breakpoints: sm/md/lg/xl
    ▸ tokens: CSS vars

    every new screen

    MUI 7
    ▸ component-first
    ▸ spacing: 8px grid
    ▸ breakpoints: xs/sm/md/lg
    ▸ tokens: theme object

    Two spacing systems. Two responsiveness philosophies. Two CSS layers to maintain.
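
    To make the spacing clash concrete, here is a minimal TypeScript sketch. It assumes the default scales (Tailwind steps of 0.25rem, MUI steps of 8px) and a 16px root font size; the function names are invented for illustration, and a customised theme would give different numbers.

```typescript
// Assumed defaults, not values taken from the project itself:
// Tailwind spacing utilities step in 0.25rem units, MUI's theme.spacing()
// multiplies by 8px, and the root font size is 16px.
const ROOT_FONT_SIZE_PX = 16;
const TAILWIND_STEP_REM = 0.25;
const MUI_STEP_PX = 8;

// Pixels produced by a Tailwind utility such as `p-4` (n = 4).
function tailwindSpacingPx(n: number): number {
  return n * TAILWIND_STEP_REM * ROOT_FONT_SIZE_PX;
}

// Pixels produced by MUI's `theme.spacing(n)` with the default theme.
function muiSpacingPx(n: number): number {
  return n * MUI_STEP_PX;
}

// The same step number yields a different gap in each system.
const mismatch = [1, 2, 4, 6].map((n) => ({
  step: n,
  tailwindPx: tailwindSpacingPx(n),
  muiPx: muiSpacingPx(n),
}));

console.log(mismatch);
```

    Under these defaults, `p-4` is 16px while `theme.spacing(4)` is 32px, so a "4" written on one screen does not match a "4" written on another.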

    The lesson: AI doesn’t know your 18-month vision. It optimizes for “working right now.” Architecture decisions, especially around design systems, data models, and module boundaries, must come from you. AI implements. You decide what to implement.

    What I’d do differently: write a one-page architecture decision record before the first prompt. Not a full spec — just: what’s the single source of styling truth? What’s the state management philosophy? How are we splitting modules? Give AI constraints, not blank permission.

    Mistake #2: AI never says no — and that’s dangerous

    By the time I was three months into industrverse, the backend had 13 NestJS modules, 7 user roles, and 9 separate dashboards. Each role had its own data access logic, its own notification system, its own workflow.

    None of it was in the original spec.

    industrverse — MVP plan vs reality

    Backend modules
    planned 5 → actual 13

    User roles
    planned 3 → actual 7

    Dashboards
    planned 3 → actual 9

    Each element made sense in isolation. The problem appears when you add up all the 11pm decisions.
    Wall covered in sticky notes — what happens when scope has no limits
    Photo: Jakub Żerdzicki / Unsplash

    Here’s what happens: you have an idea at 11pm. You describe it to Claude. It builds it in twenty minutes. It works. You ship it. Three weeks later, you realize that adding this feature broke the mental model for the next feature. But AI doesn’t tell you this — it just builds what you ask.

    Every developer on a team has a colleague who says “wait, are we sure we need this?” AI doesn’t say wait. AI says yes.

    The lesson: You must be the PM for your AI. Not just the visionary, but the person who says no. The question isn’t “can AI build this?” (it can). The question is “should this exist at all?”

    I now have a rule before any new feature: write one sentence about what problem this solves for a specific user. If I can’t write that sentence, I don’t prompt it.

    Mistake #3: Debugging code you didn’t write is slower than it looks

    Magnifying glass over a maze — hunting for a bug in AI-generated code
    Photo: TSD Studio / Unsplash

    In naswoim.org, I had a bug in the Supabase Row Level Security policies. Users in one role could occasionally see documents they shouldn’t — but only when a specific sequence of operations had happened first.

    It took me three days to find it.

    Not because the bug was complex. Because the code was AI-generated and I hadn’t read it carefully enough when it was written. The RLS policy looked right. It was syntactically correct. It passed my basic tests. The edge case was subtle — a combination of two different policy conditions that interacted in a non-obvious way.
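
    The exact policies are not shown here, so the following is only a sketch of one common way this kind of bug arises, not the actual naswoim.org code. It relies on one fact about PostgreSQL: multiple permissive policies on a table are combined with OR. The `Doc` and `User` shapes and both policy names are hypothetical, and the policies are modelled as plain TypeScript predicates rather than SQL.

```typescript
// Hypothetical model of row-level checks; names and fields are illustrative only.
type Doc = { ownerId: string; projectId: string; isDraft: boolean };
type User = { id: string; projectIds: string[] };
type Policy = (user: User, doc: Doc) => boolean;

// Policy A: owners can read their own documents.
const ownerCanRead: Policy = (user, doc) => doc.ownerId === user.id;

// Policy B: project members can read project documents.
// Written in isolation, it never considers drafts.
const memberCanRead: Policy = (user, doc) =>
  user.projectIds.some((id) => id === doc.projectId);

// Permissive policies are OR-ed: a row is visible if ANY policy allows it.
function isVisible(policies: Policy[], user: User, doc: Doc): boolean {
  return policies.some((policy) => policy(user, doc));
}

const draft: Doc = { ownerId: "alice", projectId: "p1", isDraft: true };
const bob: User = { id: "bob", projectIds: ["p1"] };

// Each policy looks right alone, but together Bob can read Alice's draft.
console.log(isVisible([ownerCanRead], bob, draft)); // false
console.log(isVisible([ownerCanRead, memberCanRead], bob, draft)); // true
```

    Reviewing each policy on its own would not reveal the problem; it only shows up when the combined result is tested for every role and document state.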

    When you write code yourself, you build a mental model of it as you write. When AI writes it, you review it — which is faster, but shallower. The model in your head is less complete. And shallow mental models make debugging slow.

    The lesson: Never merge AI code you can’t explain line by line. For anything touching auth, data access, or business-critical logic: read it like a code reviewer, not like someone checking a shopping list.

    Mistake #4: Context ends, and AI “forgets” everything

    In a long Claude Code session, the AI sees everything you’ve built together. It knows your naming conventions, your patterns, your preferences. It’s coherent.

    In the next session, it starts fresh.

    Diagram: context memory across sessions

    Session 1
    ▸ React hooks
    ▸ TanStack Query
    ▸ Zustand atoms

    ↺ reset
    Session 2
    ▸ useEffect pattern
    ▸ local useState
    ▸ API calls inline

    ↺ reset
    Session 3
    ▸ custom hooks
    ▸ Context API
    ▸ Axios interceptors

    Each session generated coherent code, but knew nothing about the previous one.

    In naswoim.org, I started a new session after a two-day break and asked Claude to build a new feature component. It generated something that worked — but used completely different patterns from everything else in the codebase. Different state management approach. Different error handling style. Different naming.

    By month four, the codebase had three distinct “eras”, each reflecting the conventions of whoever I’d been talking to at that time.

    The lesson: A CLAUDE.md file is not optional. Set it up on day one. It should contain: naming conventions, patterns to follow, patterns to avoid, which libraries to use for which problems. This is the persistent memory that bridges sessions.
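
    One way to make such conventions checkable rather than aspirational is to mirror a few of them in code. The sketch below is an illustration under assumptions: the approved-library list is invented, the regex only handles simple static `import ... from` statements, and none of this is a feature of Claude Code itself.

```typescript
// Hypothetical conventions; in practice they would mirror what CLAUDE.md says.
const approvedLibraries = ["react", "@tanstack/react-query", "zustand"];

// Collect module names from static `import ... from "module"` statements.
// Deliberately simple: it ignores dynamic imports and require().
function importedModules(source: string): string[] {
  const pattern = /import\s+[^;]*?from\s+["']([^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const moduleName = match[1];
    if (moduleName !== undefined) found.push(moduleName);
  }
  return found;
}

// Return third-party imports that are not on the approved list.
// Relative imports (starting with ".") are project code and are skipped.
function unapprovedImports(source: string): string[] {
  return importedModules(source).filter(
    (mod) => mod.charAt(0) !== "." && !approvedLibraries.some((lib) => lib === mod),
  );
}

// A file in the style of a later session that drifted from the conventions.
const driftedFile = [
  'import { useEffect } from "react";',
  'import axios from "axios";',
  'import { formatDate } from "./utils";',
].join("\n");

console.log(unapprovedImports(driftedFile)); // ["axios"]
```

    A check like this does not replace the written file; it only catches the most mechanical kind of drift, such as a new session reaching for a different HTTP library.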

    Mistake #5: Security is invisible until it isn’t

    AI generates working code. It doesn’t reliably generate secure code.

    Red padlock on a keyboard — security invisible until it isn't
    Photo: FlyD / Unsplash

    In industrverse, I had an API endpoint that was supposed to be accessible only to users with the “trainer” role. The endpoint worked correctly. It returned the right data. It handled errors gracefully.

    It also didn’t verify the JWT role claim on one specific HTTP method. A user with any authenticated token could call it.

    I found this in a manual security review — not because Claude flagged it, not because my tests caught it. Because I sat down and read through every auth-related endpoint one afternoon.
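
    A deny-by-default check makes this class of gap harder to introduce. The sketch below is framework-agnostic TypeScript, not the industrverse code: the role names other than “trainer”, the `sessionsEndpoint` rule table, and the `Claims` shape are all assumptions, and the token’s signature is assumed to have been verified before this point.

```typescript
// Hypothetical types; a real NestJS app would express this with guards.
type Role = "admin" | "trainer" | "trainee";
type Method = "GET" | "POST" | "PUT" | "DELETE";
type Claims = { role?: Role; exp: number }; // exp: expiry as Unix seconds
type Rules = Partial<Record<Method, Role[]>>;

// Required roles per method for one endpoint. DELETE is left out on purpose,
// mimicking a method that someone forgot to protect.
const sessionsEndpoint: Rules = {
  GET: ["trainer", "admin"],
  POST: ["trainer"],
  PUT: ["trainer"],
};

// Deny by default: an expired token, a missing role, or a method with no
// rule is rejected rather than silently allowed.
function isAllowed(rules: Rules, method: Method, claims: Claims, nowSeconds: number): boolean {
  if (claims.exp <= nowSeconds) return false;
  const role = claims.role;
  if (role === undefined) return false;
  const allowedRoles = rules[method];
  if (allowedRoles === undefined) return false;
  return allowedRoles.some((allowed) => allowed === role);
}

const now = 1700000000;
const trainer: Claims = { role: "trainer", exp: now + 3600 };
const trainee: Claims = { role: "trainee", exp: now + 3600 };

console.log(isAllowed(sessionsEndpoint, "POST", trainer, now)); // true
console.log(isAllowed(sessionsEndpoint, "POST", trainee, now)); // false
console.log(isAllowed(sessionsEndpoint, "DELETE", trainer, now)); // false: no rule, so denied
```

    The design choice is that forgetting a rule fails closed: the unprotected method returns a refusal that someone will notice, instead of data that nobody will.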

    Security review: checklist after every auth feature

    • Is every endpoint verifying JWT/session?
    • Is the user role checked server-side?
    • Does RLS work for all role combinations?
    • Is input validated before writing to DB?
    • Are sensitive fields filtered in the response?
    • Is the edge case (missing role, expired token) handled?

    The lesson: After every feature that touches authentication, authorization, or user data — do a manual security review. Not a vibe. A checklist.

    What I’d do differently: 6 rules

    If I started today, with everything I know now:

    1. Write an architecture brief before the first prompt
    One page. What’s the design system? What’s the state management approach? What are the module boundaries? Give AI constraints, not blank permission.

    2. One design system. Zero exceptions.
    Pick either a component library or a utility CSS framework. Not both. If you pick Tailwind, every component is Tailwind. If you pick MUI, every component is MUI.

    3. One-sentence problem statement before every new feature
    “This solves [specific problem] for [specific user].” If you can’t write this sentence, you’re not ready to prompt.

    4. If you can’t explain it line by line, don’t merge it
    Especially for anything touching auth, RLS, permissions, or data models. Review it like a code reviewer, not like someone checking a shopping list.

    5. CLAUDE.md from day one
    Naming conventions, preferred patterns, libraries to use for which problems, patterns to avoid. Update it every time you make a significant decision. This is the persistent memory between sessions.

    6. Schedule a security review after every auth-related feature
    It’s not optional. AI generates working code, not secure code. Treat auth, RLS, and input validation as areas where you always read manually.

    What I’m not saying

    I’m not saying vibe coding is flawed or that AI tools are unreliable. All three projects I built work. They have real users. They solve real problems. The productivity gain is genuine — I built in one year what a team of three would have taken eighteen months to build.

    What I’m saying is that the failure modes are specific, and they’re not obvious at the start.

    The biggest risk in vibe coding isn’t that AI writes bad code. It’s that AI writes code that looks fine — until something goes wrong. And by then, you’re looking at a codebase you half-understand, with a bug you didn’t write, and a mental model that has gaps in exactly the wrong places.

    The speed is real. Build the habits that make it safe.

  • Vibe Coding: What the Research Says, and a Tool Comparison

    In February 2025, Andrej Karpathy, a co-founder of OpenAI and former Tesla AI director, posted on X what would become one of the most-quoted developer statements of the year: “There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”

    By March, Merriam-Webster had listed the term as “slang & trending”. By December, Collins Dictionary had named it word of the year for 2025. And throughout that year, a wave of peer-reviewed research tried to answer the question the industry had been asking since February: does it actually work?

    The answer is more nuanced than most headlines suggest.

    A definition worth clarifying

    Vibe coding isn’t the same as no-code. It’s not copying ChatGPT outputs either. Researchers at ICSE 2026 define it as an iterative cycle: formulate a goal in natural language → prompt → review code → test → refine. The human remains the product owner and architect. AI is the implementation partner.

    This shifts what expertise is required — but doesn’t eliminate it. You don’t need to know how to write a React hook with optimistic UI updates. You need to know that you need one and why.

    What the research says — productivity

    Several significant studies were published in 2025. Here’s what the data shows.

    Key studies — 2025

    101 practitioner sources, 518 documented first-hand accounts of vibe coding. Top motivations: faster prototyping, accessibility for non-developers, idea exploration. Top challenges: security vulnerabilities, technical debt, lack of understanding of generated code.

    Multiple real-world apps built using vibe coding. Result: 60–80% faster prototyping, increased creativity, but human oversight required for security, quality, and maintainability.

    Task completion time: 2h41min → 1h11min (55% faster). Success rate: 70% → 78%. 87% of developers reported maintaining flow on complex tasks. 31% faster feature development cycles at team level.

    AI suggestion acceptance rate: 33%, line-of-code acceptance: 20%. Developer satisfaction score: 72/100. Key takeaway: AI accelerates — it doesn’t replace the thinking process.
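
    As a quick check of the arithmetic behind the task-completion times quoted above (2h41min down to 1h11min), here is a small TypeScript sketch. The function name is illustrative only.

```typescript
// Share of the original time saved when a task drops from `before` to `after`.
function percentTimeSaved(beforeMinutes: number, afterMinutes: number): number {
  return ((beforeMinutes - afterMinutes) / beforeMinutes) * 100;
}

const withoutAi = 2 * 60 + 41; // 2h41min = 161 minutes
const withAi = 1 * 60 + 11; // 1h11min = 71 minutes

// Time saved as a percentage of the original duration.
console.log(percentTimeSaved(withoutAi, withAi).toFixed(1)); // "55.9"

// The same change expressed as a throughput ratio (tasks per unit of time).
console.log((withoutAi / withAi).toFixed(2)); // "2.27"
```

    The saving is just under 56% of the original time, in line with the roughly 55% headline figure. Expressed as throughput it is about 2.3 times as many tasks in the same time, which is why “55% faster” and “more than twice as fast” can describe the same result.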

    New studies — 2026

    16 experienced open-source developers, 246 tasks in mature projects (average 5 years of repo experience). Surprising result: AI increased completion time by 19%. Before the tasks, developers had predicted a 24% speedup, mistaking a subjective feeling for measured reality. Context: this applies to complex, existing codebases, not greenfield projects.

    Economic analysis by researchers at CEU Budapest and Kiel Institute. Vibe coding boosts productivity by making open-source easier to use — but simultaneously eliminates user engagement (bug reports, docs, maintainer support) that sustains the OSS ecosystem. Conclusion: under widespread vibe coding, existing OSS business models are financially unsustainable.

    Preregistered cross-sectional study, N=100 students. Result: writing ability and CS knowledge are the strongest predictors of vibe coding effectiveness — stronger than general cognitive ability. Vibe coding does not eliminate the knowledge barrier — those with solid CS fundamentals and clear thinking extract far more value from it.

    Data from task management, IDE, static analysis and CI/CD systems over 2 years. Over 75% of developers use AI coding assistants — but organisational productivity gains are just ~10%. Developers feel faster; company-level software delivery metrics don’t confirm it.

    Task completion time with and without AI (minutes, GitHub Research 2024, n=95)

    Controlled conditions, 95 professional developers. 55% time reduction with Copilot, over 67% with advanced agents.

    What the research says — code quality

    This is where the data gets less comfortable for vibe coding enthusiasts.

    GitClear analysed 211 million changed lines of code from 2020–2024 and identified what they call AI-induced tech debt. The results are sobering:

    Code quality degradation: GitClear 2025 metrics (value “1” = 2020 baseline / human-written code)

    Source: GitClear AI Copilot Code Quality Research 2025, 211M lines of code. CodeRabbit (470 PRs, December 2025): AI co-authored code has 1.7x more “major issues”.

    An independent CodeRabbit analysis from December 2025 (470 pull requests) confirmed: AI-assisted code contains 1.7x more serious defects — primarily logic errors, flawed control flow, and incorrect dependencies.

    The vibe coding paradox: 55% faster, but 2.74x more security vulnerabilities. Both numbers are true simultaneously.

    Code refactoring dropped from 25% of changed lines in 2021 to under 10% in 2024. Duplication grew 8x. Copy-pasted code exceeded moved code for the first time in two decades.

    Security case — Lovable (May 2025)

    170 out of 1,645 apps built with Lovable had a vulnerability allowing access to user personal data without authentication. The apps were live in production. None displayed any security warning.

    Tool comparison

    The ecosystem grew fast. Two categories: IDE assistants (Claude Code, Cursor, Windsurf, Codex) and app builders (Lovable, Bolt, v0). They differ fundamentally in target audience and use case.

    | Tool | Best for | Strengths | Weaknesses | $/mo |
    |---|---|---|---|---|
    | Claude Code | Senior devs, complex projects | SWE-bench leader (79.6% Sonnet 4.6, 87.6% Opus 4.7). Best context retention across 40+ files simultaneously. Precise cross-file refactoring. Terminal-first, no unnecessary GUI. | Terminal only, no GUI. Slower on simple one-off tasks. Higher cost under heavy usage. | $20–200 |
    | Cursor | Developers, teams | Full IDE (VS Code fork), 1M+ users. Up to 8 parallel agents with auto-judge. .cursorrules for project context. Largest ecosystem (360k paying customers, $29.3B valuation). | Loses context on very large refactors. Lock-in to its own IDE: no JetBrains/Vim support. | $20 |
    | Windsurf | Devs, multi-IDE users | Cascade (persistent agentic context, self-recovery). Plugins for 40+ IDEs (JetBrains, Vim, Xcode, NeoVim). #1 LogRocket AI Dev Tool Rankings (Feb 2026). Acquired by Cognition for $250M. | Smaller ecosystem than Cursor. Strategic uncertainty post-acquisition by Cognition (makers of Devin). | $20 |
    | Codex (OpenAI) | Devs, enterprise GPT | Open-source CLI. Built-in web search enabled by default. MCP server support. SWE-bench ~85% (GPT-5.3-Codex). Low-latency optimised. Image input support (screenshots, wireframes). | Newer tool, smaller community. Interface less polished than Cursor/Claude Code. Less control over the execution environment. | in ChatGPT plan |
    | Lovable | Non-devs, MVPs | Fastest start: from description to working app in minutes. Great UI/UX output. Zero technical knowledge required. Perfect for validating an idea fast. | Documented security vulnerabilities (170/1,645 apps). Struggles with changing requirements. Not suitable for complex systems. | $25–50 |
    | Bolt.new | Non-devs, prototypes | StackBlitz in the browser, zero installation. Fast start. Good for demos and showcases. | Loses coherence on requirement changes. Similar limitations to Lovable: not production-ready without audit. | $20 |
    | v0 (Vercel) | Designers, frontend devs | Best for UI components (React/Next.js/shadcn). Perfect Vercel integration. Precise styling output. Great for designers with minimal JS knowledge. | Narrow scope: frontend/UI only. Doesn’t replace a full coding assistant. | $20 |

    SWE-bench: how real coding ability is measured

    SWE-bench Verified is a benchmark built from real GitHub bugs — not synthetic tasks. The model receives a repository and an issue, and must independently write a patch that passes the tests. It’s the most credible measure of an AI agent’s real-world coding capability.
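
    As a rough sketch of how such a score is computed: each issue comes with tests that must flip from failing to passing and tests that must keep passing, and the headline number is the share of issues where both groups succeed. The TypeScript below illustrates that rule with an invented `InstanceResult` shape and made-up sample data; it is not the benchmark’s actual harness.

```typescript
// Hypothetical result shape for one benchmark instance (one GitHub issue).
type InstanceResult = {
  id: string;
  failToPass: boolean[]; // tests that failed before the fix and must now pass
  passToPass: boolean[]; // tests that passed before and must keep passing
};

// An issue counts as resolved only if every test in both groups passes.
function isResolved(result: InstanceResult): boolean {
  return result.failToPass.every((ok) => ok) && result.passToPass.every((ok) => ok);
}

// Headline score: percentage of issues resolved.
function resolvedRate(results: InstanceResult[]): number {
  if (results.length === 0) return 0;
  const resolved = results.filter(isResolved).length;
  return (resolved / results.length) * 100;
}

// Made-up sample data for illustration.
const sample: InstanceResult[] = [
  { id: "issue-1", failToPass: [true, true], passToPass: [true] },
  { id: "issue-2", failToPass: [true, false], passToPass: [true] }, // fix incomplete
  { id: "issue-3", failToPass: [true], passToPass: [false] }, // fix broke something else
  { id: "issue-4", failToPass: [true], passToPass: [true, true] },
];

console.log(resolvedRate(sample)); // 50
```

    Note that a patch which fixes the reported bug but breaks an unrelated test scores zero for that issue, which is part of why the benchmark is harder than generating plausible-looking code.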

    SWE-bench Verified — top model scores (April 2026, % of issues resolved)

    From 48.5% (GPT-4 Turbo, November 2023) to 87.6% (Claude Opus 4.7, April 2026) in under 2.5 years. The rate of improvement is as striking as the score itself.

    Which tool to choose — decision map

    • Large codebase, deep refactoring, 40+ files at once?
      → Claude Code. Best context retention, SWE-bench leader.

    • Want a full AI-native IDE, largest ecosystem, parallel agents?
      → Cursor. Market leader, 1M+ users, VS Code fork.

    • JetBrains, Vim, Xcode, and you don’t want to switch editors?
      → Windsurf. Only tool with plugins for 40+ IDEs.

    • OpenAI/GPT-5 ecosystem, need live web search and MCP?
      → Codex CLI. Open-source, built-in web search, good for tasks requiring current data.

    • Non-developer, want to test an app idea in an hour?
      → Lovable or Bolt. No technical knowledge required. Get a security audit before going live.

    The best 2026 strategy: Claude Code or Codex for complex agentic tasks; Cursor or Windsurf for everyday IDE coding; Lovable / Bolt / v0 for fast prototyping without technical expertise. Most experienced teams use multiple tools simultaneously — depending on the task.

    Vibe coding — what it means

    Karpathy, who coined the term, later admitted publicly that he hand-coded his next project — because it required precision that vibe coding couldn’t deliver. That’s a good metaphor for the whole phenomenon.

    Vibe coding doesn’t replace programming. It dramatically lowers the barrier to entry, and that’s it. The data shows 55% speed gains alongside 2.74x more security vulnerabilities. Both numbers are true simultaneously. The question is no longer “whether to use AI for coding”; that’s settled.

    The question is: when does a human step in and take responsibility for what AI wrote? For a prototype — maybe never. For a system processing real user data — always, before the code goes live.


    Sources: Karpathy (X, Feb 2025) · arXiv 2510.00328 / 2510.12399 (ICSE 2026) · IJSAT 2025 · GitHub/Microsoft Research 2024 (n=95) · GitClear 2025 (211M lines) · CodeRabbit (470 PRs, Dec 2025) · ZoomInfo Enterprise Study (Jan 2025) · METR RCT arXiv 2507.09089 · arXiv 2601.15494 “Vibe Coding Kills OSS” · arXiv 2603.14133 · Faros AI (22K devs) · SWE-bench (Apr 2026) · Collins Dictionary Word of the Year 2025

  • AI in Recruitment: How to Delegate Tasks, Not Thinking

    Recently I was looking for a specific piece of equipment to buy. Instead of first thinking through what I actually needed, I opened ChatGPT and typed a query. Got an answer. Made a decision. Fast, efficient — but not mine.

    I realised afterwards that I hadn’t even formed my own opinion. AI filled the gap before it had a chance to exist.

    Harmless when buying headphones. In recruitment — not so much.

    Why this matters more in HR than anywhere else

    In most jobs, if AI “thinks for you”, the consequence is a worse document, a wasted hour, a correction in the spreadsheet. In HR, the consequence is a person who didn’t get a job they deserved. Or a company that hired someone who looked perfect on paper and didn’t work out.

    AI in recruitment operates on data about people. It’s the only field where a model’s mistake has a face.

    HR teams are reaching for AI more and more, and rightly so: these tools genuinely reduce repetitive work. But there’s a subtle difference between “AI speeds up my work” and “AI thinks instead of me”. In recruitment, that difference is critical.

    Where AI in HR genuinely helps

    Before the caveats — fair use. The data is clear about where language models perform well:

    How HR teams use AI — regular applications (% of respondents, LinkedIn Talent Trends 2024)

    The pattern is clear: AI is most adopted for repetitive and administrative tasks, which is precisely where it delivers the most value with the least risk. These are also the places where model errors are easy to catch and correct. Specific tools for specific tasks:

    🎙️
    Granola
    Best for: meeting notes
    Transcribes and summarises conversations in real time. You focus on the candidate, not note-taking. Best tool of its kind.

    🌐
    ChatGPT
    Best for: live research
    Real-time internet access. Checks companies, gathers industry context, drafts job ads and document templates.

    📄
    Claude
    Best for: long documents
    Analyses longer CVs, briefs, reports. Large context window — reads an entire file at once. Precise on complex queries.

    🇵🇱
    Bielik
    Best for: sensitive data
    Polish open-source model. Candidate data processed locally — no sending to external servers. Good for GDPR compliance.

    The common denominator: AI helps where the task is repetitive, structured and doesn’t require judging a person as a person.

    Where the problem starts

    The problem appears when AI moves into areas requiring human judgement — and does so quietly.

    Warning signs — when AI starts thinking for you

    You ask AI for an opinion before forming your own
    „Assess this candidate based on their CV” — before you’ve read it yourself.

    You generate interview questions entirely with AI
    The conversation is technically correct but generic — it doesn’t draw out what matters for your specific team.

    AI summarises the interview — and that becomes your memory of the candidate
    The summary is accurate, but you’ve cut your own intuition and emotional read of the conversation out of the process.

    The yes/no decision is effectively the model’s decision
    AI scoring dropped the candidate from the process — and nobody checked why.

    Framework: you first, then AI

    The principle I apply in my own work and propose to HR teams:

    1. Before asking AI: 2 minutes of your own thinking
    Skim the CV. Listen to the recording. Read the cover letter. Make a mental note. Then use AI to verify or expand. This is the only rule that requires discipline; the rest is easy.

    2. AI accelerates, it doesn’t initiate
    Use the model to improve something you’ve already thought through, not to fill the gap before the thought exists. The difference is subtle, but the consequences are completely different.

    3. Every decision about a person must be yours
    AI can recommend, rank, suggest. But “we’re hiring” and “we’re not hiring” are sentences backed by a human, not a model. This isn’t rhetoric. It’s accountability.

    AI is like a good assistant — not a good recruiter

    A good assistant takes the admin off your plate, speeds up research, prepares drafts. But it doesn’t replace your judgement of a person. It can’t sense the tension in a candidate’s voice. It won’t notice that someone is excellent despite an imperfect CV. It won’t feel that despite perfect competencies — someone won’t fit your team’s culture.

    These things are non-transferable. And that’s exactly why they’re your advantage — regardless of how many models appear on the market.


    Tools mentioned: Granola (granola.so) | ChatGPT (openai.com) | Claude (claude.ai) | Bielik (speakleash.org) | Data: LinkedIn Talent Trends 2024

  • How to Teach AI to Non-Programmers — Lessons from Industrial Implementations

    2018. I’m standing with a group of foundry operators at Krakodlew in Kraków. They’re about to put on VR headsets for the first time in their lives. One of them, with 20 years at the furnace and deep experience, tells me: “This is for young people. I know what I’m doing.”

    Two hours later, that same person asks me when the next session is.

    Teaching technology — especially AI and AI-powered tools — follows the same pattern. The entry barrier is psychological, not technical. And when you break through it with the right method, the rest comes surprisingly quickly.

    Over the last several years I’ve worked with people who had never written a line of code: machine operators, production managers, executives, vocational school students. Here’s what actually works.

    Mistake #1: start with the technology

    Most AI courses and workshops begin like this: “Today you’ll learn about language models. An LLM is a model trained on large datasets…”

    You lose the learner by the fourth sentence.

    Adults learn differently from children. They don’t absorb abstractions first and then search for applications — they absorb concrete experience first and then find theory to explain it. Andragogy (adult learning theory) has been saying this since the 1960s. The problem is that most AI instructors are engineers who teach the way they themselves learned.

    Rule: start with a problem the participant has today. Not with the technology you want to show them.

    At the foundry, we didn’t start with “how VR works.” We started with: “Let me show you a situation where an accident happened at a similar plant. How would you respond?” The technology became a tool for solving a specific, real problem. Not a demonstration.

    The framework that works: “Touch first, explain after”

    Through VR training, I developed a sequence I call “touch first, explain after.” It works exactly the same in teaching AI and vibe coding:

    1. Do it together. Don’t explain first. Give the participant a simple task to complete immediately. A mistake in a safe environment teaches more than an hour of lecture.
    2. Name it together. When something works (or doesn’t), ask: “What just happened? Why do you think that is?” The learner builds a mental model from their own experience.
    3. Now explain the mechanism. Only now do concepts like “token,” “context,” “temperature” have meaning, because you have a concrete situation to attach them to.
    4. Give them a problem to solve independently. Not “an exercise with instructions,” but a problem you haven’t walked through with them before.

    • ~40% shorter onboarding time after VR vs. traditional training (Krakodlew, 2018)
    • 0 incidents caused by procedure gaps in the first 6 months (same cohort)
    • Faster knowledge acquisition in an immersive environment vs. classroom (PwC 2020, n=10,000)

    These numbers are from VR training, but the pattern is the same: when you teach through doing, not through listening, outcomes are dramatically different.

    What blocks AI learning — and how to remove it

    Teaching AI and vibe coding, you’ll encounter the same barriers repeatedly:

    “This isn’t for me, I’m not technical enough”

    Antidote: show a specific example of someone similar to the learner who’s already doing it. Not Elon Musk. Someone who runs a small business and uses Claude to write proposals. Someone who’s an operator and learned to program a robot in a course. Identification before aspiration.

    “I don’t know what to say to AI”

    Antidote: give them templates. Not as rigid forms, but as starting points. “I am [role], I’m trying to [goal], I have a problem with [specific].” Learners need scaffolding before they build their own prompting style.
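
    As a small illustration, the template can be turned into a fill-in helper. This is a sketch only: the `PromptContext` type, the `buildPrompt` name, and the example values are invented here.

```typescript
// Fields mirror the template in the text; the example values are invented.
type PromptContext = { role: string; goal: string; problem: string };

// Fill the starter template. Empty fields are rejected so the learner
// notices what is still missing before sending anything.
function buildPrompt(context: PromptContext): string {
  const missing = (["role", "goal", "problem"] as const).filter(
    (key) => context[key].trim() === "",
  );
  if (missing.length > 0) {
    throw new Error("Fill in: " + missing.join(", "));
  }
  return `I am ${context.role}, I'm trying to ${context.goal}, I have a problem with ${context.problem}.`;
}

const examplePrompt = buildPrompt({
  role: "a production manager",
  goal: "summarise weekly downtime reports",
  problem: "inconsistent machine names in the spreadsheet",
});

console.log(examplePrompt);
```

    The point of the helper is the refusal on empty fields: a learner who cannot name the goal or the specific problem is not yet ready to ask.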

    “How do I know if the answer is correct?”

    This is the best barrier, because it’s legitimate. Critical thinking toward AI output is a key skill, not an obstacle to learning. Teach the learner to verify: cross-check with another source, test “does this make sense in my context,” ask AI a follow-up (“check whether you made an error in assumption X”).

    The best AI learners aren’t those who trust it unconditionally — they’re those who can hold a dialogue: ask, verify, correct, iterate.

    Robotics as the entry point to AI for younger learners

    At XR FabLab in Chrzanów, we worked with vocational high school students. The threshold for abstract „AI” was too high. Robotics turned out to be the perfect bridge.

    Why robotics works:

    • Immediate feedback loop — the robot moves or it doesn’t. No ambiguity. The student sees the result of their decision within seconds, not days.
    • Physical artifact — they can touch it, show it to parents, break it and fix it. Code on a screen is abstract. A robot is real.
    • Programming as commands — students intuitively understand a sequence of instructions. It’s a direct analogy to prompting AI.
    • Competitive element — races, challenges, leaderboards. External motivation until intrinsic motivation develops.

    When a student understands that the robot does exactly what they told it to do — no more, no less — they also understand why precision in prompting matters. That’s the transfer moment.

    What separates a good AI instructor from a weak one

    A good AI instructor isn’t someone who has the API documentation memorised. It’s someone who:

    • Remembers what it felt like to not know — and can genuinely inhabit the learner’s perspective
    • Gets excited when a student surprises them with a new application of the tool
    • Treats a student’s mistake as diagnostic information, not failure
    • Is still learning themselves — AI changes every few months, and standing still means falling behind

    The most important sentence you can say to a learner: „I don’t know, let’s find out together.” That models exactly the kind of thinking AI requires.

    A practical starting framework

    If you’re designing an AI course or workshop from scratch, one structure I’d recommend:

    • Session 1: Participant’s problem → first contact with the tool → surprise (good or bad) → shared analysis
    • Session 2: Building vocabulary → prompt templates → first independent task
    • Session 3+: Participant’s own project → iteration → presenting the result

    The participant’s own project is critical. When a participant solves their own problem, not a practice exercise, and AI helps them do it, that’s when the tool becomes theirs.

    Who in your organisation or classroom is hardest to convince about AI — and what’s their main barrier?


    Author has been implementing VR training in industrial environments since 2018 (Krakodlew, XR FabLab Chrzanów). Teaches AI, automation and AI-assisted coding in practical contexts — in Poland and internationally. Sources: PwC VR Soft Skills Study 2020 (n=10,000); own data Krakodlew/Industrverse 2018–2024.

  • Vibe Coding: How I Built 3 Applications Without Hiring a Developer

    In 2023, I decided to build a platform for property investors. Plot purchase checklists, budget tracking, document management, experts, land maps — all in one place. Mobile app for iOS and Android, plus web.

    I had zero developers on the team.

    I wasn’t looking for a freelancer either. Instead, I opened Claude Code and started writing prompts.

    Today, naswoim.org is live. Users are logging in. Data is stored in Supabase. The mobile version is in the App Store and Google Play. And I built it in parallel with two other projects.

    What vibe coding actually is

    Vibe coding isn’t the same as no-code. You still write code. But you don’t write it alone from scratch — you write it together with an AI model that generates implementation based on your specification.

    It’s also not about pasting questions into ChatGPT and copying the output. It’s about thinking precisely about a problem and describing it in a way that AI can translate into working code.

    Working definition: Vibe coding is a model where you are the product owner and architect, and AI is the implementation partner. You decide what and why. AI decides how.

    This shifts what’s required. You don’t need to know how to write a React hook with optimistic UI updates. You need to know that you need one — and why.

    Three products, zero external developers

    • 3 applications built in the vibe coding model
    • 0 external developers hired
    • 4 platforms: web, iOS, Android and B2B SaaS

    marcinpaszkiewicz.com — this site. Astro SSR, WordPress Headless as CMS, deployed on Vercel. Every change — new article, layout fix, new feature — is built through Claude Code. Time from idea to live change: usually under an hour.

    naswoim.org — full stack: React 19 + Vite, Tailwind CSS, Supabase (PostgreSQL + Auth + Storage + Realtime), Capacitor for iOS and Android. Twenty pages, fourteen domain hooks, twenty-four feature components. Row-level security, 20 SQL migrations, Leaflet maps integration. Built entirely without hiring.

    industrverse.com — the most technically ambitious. Frontend: Next.js + React 19. Backend: NestJS with 13 modules, Prisma + PostgreSQL, Redis, JWT/Passport, Socket.io for real-time VR sessions, Swagger, Google API integrations. Docker Compose with four services. npm monorepo. Seven user roles. Nine role-specific dashboards.

    None of these projects would have been feasible within a reasonable time and budget without AI as a programming partner. With AI, each one became achievable for a single person.

    My workflow

    There’s no single tool. I use several depending on the task:

    • Claude Code (CLI) — for deep file-level work: new features, refactoring, debugging with full project context. It understands repo structure, reads multiple files simultaneously, and proposes changes down to the line.
    • Claude.ai — for architecture planning, design decision discussions, rapid questions at the design stage. I use this to talk through what I’m building before writing a line of code.
    • GitHub Copilot — for in-editor autocomplete, especially for repetitive patterns (components, tests, SQL migrations).

    The iteration loop: define the task precisely → AI generates a scaffold → I review and guide → AI refines → I test → repeat. The key word is „guide” — this isn’t an autopilot. You’re driving.

    What AI does well

    • Scaffolding and boilerplate — new API endpoint, new component, new SQL migration. AI knows what these look like and generates them quickly and correctly.
    • Debugging with context — paste the error + relevant code → AI diagnoses. Usually accurately. This saves hours of Stack Overflow searches.
    • Refactoring — renaming, restructuring, adding TypeScript types to existing JavaScript. AI is fast and precise at this.
    • Cross-layer translation — „I have this logic on the backend, write the corresponding frontend hook.” AI understands both sides.

    Where humans are still required

    Vibe coding doesn’t eliminate the need for thinking. It changes what you think about.

    • Architecture decisions — how to split the system, where to put logic, what tradeoffs to accept. AI suggests, but you decide.
    • UX and product design — what a user should feel at a given moment. AI doesn’t know your users.
    • Security — any code touching auth, sensitive data, or permissions requires your verification. AI makes mistakes in RLS, input validation, API exposure.
    • Complex state in large codebases — above a certain project size, AI loses context. You need to guide it precisely.

    Vibe coding doesn’t replace understanding code — it shifts the boundary of what’s achievable without years of programming experience. It’s not magic. It’s leverage.

    What I learned along the way

    The biggest trap: treating AI as a code vending machine. You paste a description, receive code, and paste it into the project without reading it. This ends with security bugs, inconsistent style, and technical debt you’ll only understand six months later.

    The second trap: prompts that are too vague. „Write me a budget management app” produces generic, unusable output. „Write a useProjectBudget hook that subscribes to the expenses table in Supabase via Realtime, aggregates totals per category, and returns isLoading, error, totals, and addExpense” gives you exactly what you need.

    Vibe coding doesn’t require knowing every line of code. It requires thinking precisely about problems — and that’s a skill you can deliberately develop.

    Where to start

    If you want to start building in vibe coding mode, three things give the highest return:

    • Git basics — commit, push, branch, merge. Without this you can’t experiment safely. Git is your safety net.
    • How to read error messages — you don’t need to understand every line of code, but you do need to read an error message and paste it to AI with context. That’s 80% of debugging.
    • SQL fundamentals — most applications have a database. Understanding SELECT, INSERT, JOIN and basic schema design lets you guide AI through data layers.
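
    To make the SQL bullet concrete, here is a small sketch that exercises schema design, INSERT, SELECT and JOIN in one go. It uses Python’s built-in sqlite3 module with an in-memory database purely so it runs anywhere without setup; the projects and expenses tables, their columns and the values are invented for illustration and are not the naswoim.org schema.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE expenses (
        id         INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL REFERENCES projects(id),
        category   TEXT NOT NULL,
        amount     REAL NOT NULL
    );
""")

# INSERT: one project and a few expenses (made-up values).
conn.execute("INSERT INTO projects (id, name) VALUES (1, 'Plot A')")
conn.executemany(
    "INSERT INTO expenses (project_id, category, amount) VALUES (?, ?, ?)",
    [(1, "legal", 1200.0), (1, "survey", 800.0), (1, "legal", 300.0)],
)

# SELECT + JOIN: total spend per category for each project.
totals = conn.execute("""
    SELECT p.name, e.category, SUM(e.amount) AS total
    FROM expenses AS e
    JOIN projects AS p ON p.id = e.project_id
    GROUP BY p.name, e.category
    ORDER BY e.category
""").fetchall()

print(totals)
```

    Being able to read a query like the last one is usually enough to check whether an AI-generated data layer joins the right tables and groups by the right columns.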

    Next step: pick a tool. Claude Code, Cursor, GitHub Copilot — each has strengths. Start with one, on a small project. Not a database-backed application right away.

    What problem would you solve with your first vibe coding project?


    Author has been building in the vibe coding model since 2023. Stack: naswoim.org (React 19, Supabase, Capacitor), industrverse.com (NestJS, Next.js, PostgreSQL, Redis, Docker), marcinpaszkiewicz.com (Astro SSR, WordPress Headless, Vercel).

  • VR Training vs. E-Learning in Industry: What the Data Says and When VR Wins

    Virtual foundry simulator — VR training environment at Krakodlew
    Krakodlew foundry, Kraków — operator during VR training. Mistakes happen here, not at the furnace.

    It’s 2018. I’m standing in front of the board of one of Poland’s largest foundries, explaining why VR training makes sense. The room is full of people who have trained workers for 30 years using the master-apprentice method — and it worked for 30 years.

    The question that followed was predictable: „Why pay for virtual reality when we already have e-learning?”

    It’s a good question. But it’s the wrong one.

    Because the problem isn’t the cost of the technology — it’s knowing what you’re using it for.

    What the data says — PwC 2020

    In 2020, PwC published one of the largest comparative studies of VR training in a corporate setting. 10,000 participants. Three methods: classroom, e-learning, VR. The results were unambiguous.

    Learning speed — index relative to classroom training (PwC 2020, n=10,000)

    VR learners completed training 4× faster than classroom and 1.5× faster than e-learning. Source: PwC, 2020 VR Soft Skills Study

    Four times faster isn’t a marginal gain. It’s the difference between a week-long training programme and a two-day one.

    Confidence in applying learned skills — increase vs. classroom training (PwC 2020)

    VR learners showed 275% higher confidence in applying skills than after classroom training. Source: PwC, 2020 VR Soft Skills Study

    The gap between „I know how to do this” and „I feel confident I can do this” — that’s the 275%. In an industrial environment, that gap translates directly to accidents, errors, and quality failures.

    VR simulator view — virtual foundry environment
    View from the simulator — operator inside the virtual foundry. Every mistake is feedback, not an accident.

    When VR beats e-learning

    E-learning works where the goal is information transfer: safety procedures, regulations, product knowledge. It’s cheap to produce, easy to update, and runs on any laptop.

    VR wins when you need to build a physical reflex and genuine confidence in situations you can’t safely repeat in reality.

    • 4× faster knowledge acquisition than classroom training
    • 275% higher confidence in applying skills vs. classroom
    • 3.75× higher emotional engagement than traditional training

    At the foundry, the VR scenarios we trained were emergency situations at the melting furnace — overheating, metal spillage, incorrect temperature readings. None of these can be practised live without serious accident risk.

    VR doesn’t replace e-learning. It solves a different problem — where knowledge alone isn’t enough and the body needs to learn too.

    The economics: when VR stops being expensive

    The main board-level objection is always the same: „It’s expensive.” And it’s true — VR content creation costs more than e-learning. But that’s a content production cost, not a per-trainee training cost.

    Training cost per participant — e-learning vs. VR at scale (estimated model)

    VR cost per participant drops sharply with scale — at 500+ people it becomes comparable to e-learning, while delivering significantly higher effectiveness. Estimated model based on market data 2024.

    Then add the costs e-learning doesn’t eliminate: stopping the production line for training, travel, room hire, an instructor. VR can be delivered at the workstation, in 20 minutes, without stopping production.
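
    The scale effect can be sketched as simple fixed-cost amortisation: a one-off content production cost spread over all trainees, plus a delivery cost per trainee. Every figure below is a placeholder chosen so that the two curves meet at 500 participants, mirroring the estimated model above; none of it is real pricing or the author’s data, and the function names are hypothetical.

```python
def cost_per_participant(content_cost: float, per_head_cost: float, participants: int) -> float:
    """One-off content cost spread over all trainees, plus the per-trainee delivery cost."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return content_cost / participants + per_head_cost


def break_even_participants(a: dict, b: dict) -> float:
    """Headcount at which option a and option b cost the same per participant."""
    saving_per_head = b["per_head_cost"] - a["per_head_cost"]
    if saving_per_head <= 0:
        raise ValueError("option a never catches up: its per-head cost is not lower")
    return (a["content_cost"] - b["content_cost"]) / saving_per_head


# Placeholder figures for illustration only (assumptions, not market data).
VR = {"content_cost": 40_000.0, "per_head_cost": 20.0}
ELEARNING = {"content_cost": 5_000.0, "per_head_cost": 90.0}

for n in (10, 50, 500, 2000):
    vr = cost_per_participant(participants=n, **VR)
    el = cost_per_participant(participants=n, **ELEARNING)
    print(f"{n:>5} trainees: VR {vr:8.2f} | e-learning {el:8.2f}")

print(break_even_participants(VR, ELEARNING))  # 500.0
```

    The per-head figure is where line stoppage, travel, room hire and instructor time would be counted, which is why it can end up higher for the cheaper-to-produce option.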

    VR simulator — safety procedures at the foundry
    Safety procedure simulator — operator practises emergency response without stopping production.

    When e-learning is the better choice

    VR isn’t the answer to everything. There are situations where e-learning is objectively the right call:

    • Fast content updates — new regulation, product change, procedure update. E-learning can be changed in a day. VR content requires 3D reconstruction.
    • Pure knowledge training — compliance, labour law, general onboarding. You don’t need immersion to remember an emergency phone number.
    • Small one-off groups — if you’re training 5 people once a year, VR content production costs won’t pay back.
    • Any-device access — e-learning works on a phone at home. VR requires hardware.

    The question isn’t „VR or e-learning” — it’s „which process needs which method”.

    From practice: VR training at a foundry

    At Krakodlew, we implemented VR training for safety procedures at melting furnaces. The result: approximately 40% reduction in new operator onboarding time and zero incidents caused by procedure gaps in the first 6 months on the job. A pattern I’ve applied across subsequent industrial implementations.

    A practical decision rule

    Before deciding on a training method, answer three questions:

    • Could a mistake during training be dangerous or very costly in reality? → VR
    • Does the training involve physical actions and procedures, not just knowledge transfer? → VR
    • Will you be training the same content to 50+ people over several years? → VR is cost-effective

    If the answer to all three is „no” — e-learning is the smarter choice.
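
    The three questions can be written down as a tiny checklist function. This is one possible reading, not a definitive rule: the text only settles the case where all three answers are „no”, so treating any single „yes” as a case for VR is an assumption, as are the names TrainingCase and recommend.

```python
from dataclasses import dataclass


@dataclass
class TrainingCase:
    mistake_is_dangerous_or_costly: bool  # question 1
    involves_physical_procedures: bool    # question 2
    trainees_over_years: int              # question 3: same content, several years


def recommend(case: TrainingCase, scale_threshold: int = 50) -> str:
    """Return 'VR' if any of the three questions is answered yes, else 'e-learning'."""
    answers = (
        case.mistake_is_dangerous_or_costly,
        case.involves_physical_procedures,
        case.trainees_over_years >= scale_threshold,
    )
    return "VR" if any(answers) else "e-learning"


# Furnace emergency drill: dangerous, physical, many operators over time.
print(recommend(TrainingCase(True, True, 200)))   # VR
# Compliance briefing for five people: none of the three applies.
print(recommend(TrainingCase(False, False, 5)))   # e-learning
```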

    Which training in your organisation would be the best candidate for VR?


    Sources: PwC — VR Soft Skills Study 2020 (n=10,000) | Brandon Hall Group — Learning & Development Research | Own data: Krakodlew implementations 2018–2024

  • Robotisation Potential Analysis in a Foundry: Why the Right Priority Is Rarely Obvious

    Krakodlew foundry production hall
    Krakodlew foundry hall, Kraków — one of the places where this story began.

    2018. I’m at one of the largest Polish foundries. Rather than looking for an „obvious” robotisation candidate, I start by building a digital twin of the production hall: a model that maps individual process sections and makes it possible to evaluate automation potential without stopping the line.

    Each section is described through a set of parameters: operation frequency, OHS hazard level, process suitability for automation, estimated ROI. Data is collected at every workstation. Only then do I form a recommendation.

    Conceptual diagram — digital twin of the hall: section analysis by hazard level

    • Zone A: Casting cleaning (⚠ hazard: CRITICAL)
    • Zone B: Melting furnaces (⚠ hazard: HIGH)
    • Zone C: Quality control (hazard: LOW)
    • Zone D: Palletising (hazard: LOW)

    Conceptual model — the actual digital twin covered over a dozen production sections with full process parametrisation.

    The question from management: where do we invest first?

    The right question changes everything

    I don’t ask „what’s cheapest to automate”. I ask: „where is the human most at risk?”

    This isn’t sentimentality. It’s rational economics. Processes with the highest hazard levels also carry other costs: high staff turnover, sick leave, medical costs, and accident risk with legal consequences. OHS and economics point in the same direction here.

    Analysis criteria, in order of importance:

    • Hazards to life and health (OHS) — does automation eliminate harmful factors: temperature, noise, dust, vibration, accident risk? This is the overriding criterion.
    • Process repeatability — can a robot perform the task identically, every time, without exceptions?
    • Return on investment (ROI) — when does the investment pay off and what are the risks?
    • Technical maturity — is the technology stable enough for a production environment?
    • Line resilience — what happens to production when the robot stops unexpectedly?

    Candidate evaluation — robotisation potential analysis (0–10, primary criterion: OHS)

    Higher score = higher robotisation priority. Primary criterion: OHS hazard level and impact on operator health.
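
    One way such a ranking could be computed, as a sketch under stated assumptions: the OHS score decides first, and a weighted sum of the remaining criteria only breaks ties. The weights and all scores below are illustrative placeholders, not the values from the actual analysis, and the names Candidate and rank are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    ohs: int            # hazard removed by automation, 0-10 (overriding criterion)
    repeatability: int  # 0-10
    roi: int            # 0-10
    maturity: int       # 0-10
    resilience: int     # 0-10


# Assumed weights for the secondary criteria, decreasing in the order they are listed.
SECONDARY_WEIGHTS = {"repeatability": 0.4, "roi": 0.3, "maturity": 0.2, "resilience": 0.1}


def secondary_score(c: Candidate) -> float:
    """Weighted sum of the non-OHS criteria."""
    return sum(getattr(c, name) * weight for name, weight in SECONDARY_WEIGHTS.items())


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """OHS decides first; the weighted secondary score only breaks ties."""
    return sorted(candidates, key=lambda c: (c.ohs, secondary_score(c)), reverse=True)


# Illustrative scores only: NOT the values from the actual analysis.
candidates = [
    Candidate("Palletising", ohs=3, repeatability=9, roi=8, maturity=9, resilience=8),
    Candidate("Casting cleaning", ohs=10, repeatability=7, roi=6, maturity=6, resilience=6),
    Candidate("Furnace temperature measurement", ohs=8, repeatability=9, roi=8, maturity=7, resilience=6),
    Candidate("Quality control", ohs=3, repeatability=6, roi=5, maturity=7, resilience=7),
]

for position, c in enumerate(rank(candidates), start=1):
    print(position, c.name, c.ohs, round(secondary_score(c), 1))
```

    Sorting on OHS before anything else, rather than folding it into a single weighted total, keeps a high-ROI but low-hazard process such as palletising from outranking a hazardous one.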

    Recommendation #1: cleaning of large-format castings

    The most physically destructive work on the floor: grinding, heavy lifting, extreme heat, noise, dust, vibration. The operator is exposed to all major harmful factors at once. No other analysed process showed such an accumulation of hazards.

    Harmful factors — large-format casting cleaning

    • Temperature: extreme
    • Noise: >100 dB
    • Metal dust: high
    • Vibration: continuous
    • Load: heavy

    This isn’t just ethics. Large-format casting cleaning also has characteristics that favour automation — high repeatability of large-casting geometry, no complex decision-making, measurable outcome (surface quality). Protecting people and economic logic point in the same direction here.

    Recommendation #2: temperature measurement in the furnace

    Foundry interior — melting furnaces
    Melting furnaces in the foundry — where temperature measurement accuracy directly affects casting quality and process safety.

    The second priority is equally non-arbitrary. Measuring temperature near the furnace means direct exposure to thermal radiation and the risk of extreme thermal events. An additional argument: an incorrect measurement means a defective casting, which is a direct financial loss and a quality risk for the client. High repeatability, zero error tolerance, clear ROI.

    Robotisation potential analysis is not a wishlist of what’s cheapest to implement. It’s a ranked argument for where technology creates the greatest value — for people and the process simultaneously.

    Poland vs the world: the scale of the challenge

    Data from the International Federation of Robotics (IFR 2024) shows where we stand:

    Robotisation density — robots per 10,000 manufacturing workers (IFR 2024)

    Source: International Federation of Robotics — World Robotics 2024

    Poland has 81 robots per 10,000 workers, almost three times fewer than the EU average and more than five times fewer than Germany (429/10,000). We’re the largest robotics market in Central and Eastern Europe, but the gap is enormous. That’s precisely why the quality of automation decisions matters so much here: there’s no room for expensive mistakes.

    A warning: when the „obvious choice” costs billions

    GIFA 2019 trade fair in Düsseldorf
    GIFA 2019 in Düsseldorf — the world’s largest foundry industry trade fair. This is where we presented our first VR implementation for industry.

    Tesla’s 2017–2018 story is a textbook example of a mistake that happens across every industry. Tesla installed hundreds of industrial robots to produce 5,000 cars per week. The result? It couldn’t produce even 2,500.

    Elon Musk publicly acknowledged: „Yes, excessive automation at Tesla was a mistake. To be precise, my mistake. Humans are underrated.”

    • 2,500 cars/week instead of the planned 5,000, despite full automation
    • 1–3 yrs: typical ROI timeframe; simpler applications (cobots, palletising) under a year
    • 6–36 mo.: ROI depends on the project; simple cobots under a year, complex systems 2–3 years
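
    If „ROI timeframe” is read as a simple, undiscounted payback period (an assumption; the sources may define it differently), the arithmetic behind figures like these is a single division. The amounts below are made up solely to land on both ends of the 6–36 month range.

```python
def payback_months(investment: float, monthly_net_saving: float) -> float:
    """Simple (undiscounted) payback period in months."""
    if monthly_net_saving <= 0:
        raise ValueError("no payback: monthly net saving must be positive")
    return investment / monthly_net_saving


# Made-up figures for illustration only.
print(payback_months(60_000, 10_000))   # 6.0  (a simple cobot cell)
print(payback_months(900_000, 25_000))  # 36.0 (a complex system)
```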

    Robots perform repetitive tasks in stable environments — and they excel at it. Humans remain irreplaceable where flexibility and judgment in exceptional situations are required. Hybrid solutions are optimal — not full automation.

    What actually gets automated

    Most commonly automated processes in manufacturing (% of companies, Windward Studios 2024)

    Operator during VR training
    Operator during VR training at Krakodlew — technology supporting humans, not replacing them.

    Global statistics show the dominance of palletising and material handling — processes with clear ROI and low hazard. In a foundry environment, where harmful factor accumulation is exceptionally high, the analysis starts from a different place.

    Conclusion: potential analysis, not a wishlist

    After this analysis in 2018, I returned to management with a recommendation that surprised many. They expected me to point to a cheaper, faster-to-implement target. My argument was simple: if we can’t justify prioritising robotisation where the stakes are an operator’s health and life, we can’t justify it anywhere.

    Robotisation potential analysis is not a political document. It’s a ranking in which protecting people and creating lasting process value go hand in hand — because only such projects make sense in the long run.

    Have you ever seen a process at your facility that „everyone knew should be automated” — but nobody calculated the true cost of not automating it for the people doing it?


    Sources: IFR — World Robotics 2024 (ifr.org) | B. Büchel, D. Floreano, IMD Case Study: Tesla 2018 | S. Gibbs, The Guardian, 16.04.2018 | AutomatykaOnline.pl — ROI of automation | pro-assem.pl — industrial robot applications | CentrumMaszynCNC.pl — process selection criteria | Photos: personal archive, Krakodlew / GIFA 2019