initial commit
---
epic: 1
story: 1.1
title: "Lexer & Tokenizer"
status: draft
---

## Epic 1 — Core Calculation Engine (Rust Crate)

**Goal:** Build `calcpad-engine` as a standalone Rust crate that powers all platforms. This is the foundation.

### Story 1.1: Lexer & Tokenizer

As a CalcPad engine consumer,
I want input lines tokenized into a well-defined token stream,
So that the parser can build an AST from structured, unambiguous tokens rather than raw text.

**Acceptance Criteria:**

**Given** an input line containing an integer such as `42`
**When** the lexer tokenizes the input
**Then** it produces a single `Number` token with value `42`
**And** no heap allocations occur for this simple expression

**Given** an input line containing a decimal number such as `3.14`
**When** the lexer tokenizes the input
**Then** it produces a single `Number` token with value `3.14`

**Given** an input line containing a negative number such as `-7`
**When** the lexer tokenizes the input
**Then** it produces tokens representing the negation operator and the number `7`

**Given** an input line containing scientific notation such as `6.022e23`
**When** the lexer tokenizes the input
**Then** it produces a single `Number` token with value `6.022e23`

**Given** an input line containing scale suffixes such as `5k`, `2.5M`, or `1B` (thousand, million, billion)
**When** the lexer tokenizes the input
**Then** it produces `Number` tokens with values `5000`, `2500000`, and `1000000000` respectively
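
As a non-normative illustration of the scale-suffix criterion, here is a minimal sketch. The helper name `parse_scaled` and the use of `f64` for number values are assumptions made for this example, not part of the story. It expects a literal that has already been isolated from the line, so telling `5k` apart from a unit such as `5kg` is left to the real lexer.

```rust
/// Hypothetical helper (not defined by this story): parses an isolated
/// numeric literal and applies an optional `k`, `M`, or `B` multiplier.
fn parse_scaled(literal: &str) -> Option<f64> {
    // The three suffixes are single-byte ASCII, so cutting one byte off
    // the end of the slice is safe when one of them matched.
    let (digits, multiplier) = match literal.chars().last()? {
        'k' => (&literal[..literal.len() - 1], 1e3),
        'M' => (&literal[..literal.len() - 1], 1e6),
        'B' => (&literal[..literal.len() - 1], 1e9),
        _ => (literal, 1.0),
    };
    // `str::parse::<f64>` also accepts exponent forms such as `6.022e23`.
    digits.parse::<f64>().ok().map(|n| n * multiplier)
}
```

Under these assumptions, `parse_scaled("2.5M")` yields `Some(2500000.0)`, and a bare suffix such as `"k"` yields `None`.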

**Given** an input line containing currency symbols such as `$20`, `€15`, `£10`, `¥500`, or `R$100`
**When** the lexer tokenizes the input
**Then** it produces `CurrencySymbol` tokens paired with their `Number` tokens
**And** multi-character symbols like `R$` are recognized as a single token
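
One way to satisfy the multi-character requirement is longest-match-first lookup. The sketch below is illustrative only: the table `CURRENCY_SYMBOLS` and the function `match_currency` are hypothetical names, and the symbol list is limited to the five examples in this criterion.

```rust
/// Symbols from the criterion above, ordered longest first so that
/// `R$` is tried before `$`. (Hypothetical table for illustration.)
static CURRENCY_SYMBOLS: [&str; 5] = ["R$", "$", "€", "£", "¥"];

/// Returns the matched symbol and the remaining input when `input`
/// starts with a known currency symbol.
fn match_currency(input: &str) -> Option<(&'static str, &str)> {
    CURRENCY_SYMBOLS
        .iter()
        .find(|sym| input.starts_with(**sym))
        // `sym.len()` is a byte length, which is what slicing needs;
        // `€`, `£`, and `¥` are multi-byte in UTF-8.
        .map(|sym| (*sym, &input[sym.len()..]))
}
```

With this ordering, `match_currency("R$100")` returns `Some(("R$", "100"))` rather than failing on the leading `R` or splitting the symbol in two.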

**Given** an input line containing unit suffixes such as `5kg`, `200g`, or `3.5m`
**When** the lexer tokenizes the input
**Then** it produces `Number` tokens followed by `Unit` tokens

**Given** an input line containing arithmetic operators `+`, `-`, `*`, `/`, `^`, `%`
**When** the lexer tokenizes the input
**Then** it produces the corresponding `Operator` tokens

**Given** an input line containing natural language operators such as `plus`, `minus`, `times`, or `divided by`
**When** the lexer tokenizes the input
**Then** it produces the same `Operator` tokens as their symbolic equivalents
**And** `divided by` is recognized as a single two-word operator
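
The word-operator criterion could be met with a small phrase table, sketched below. The `Operator` variants and the function `match_word_operator` are assumptions for this example; the story does not define them. The sketch matches only a single space inside `divided by` and is case-sensitive, which a real lexer may want to relax.

```rust
/// Hypothetical operator kinds shared by symbolic and word forms.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Add,
    Subtract,
    Multiply,
    Divide,
}

/// Tries to read a word operator at the start of `input`.
/// Returns the operator and the number of bytes consumed.
fn match_word_operator(input: &str) -> Option<(Operator, usize)> {
    // The two-word phrase is listed first so it is consumed as one token.
    const WORDS: [(&str, Operator); 4] = [
        ("divided by", Operator::Divide),
        ("plus", Operator::Add),
        ("minus", Operator::Subtract),
        ("times", Operator::Multiply),
    ];
    WORDS.iter().find_map(|(word, op)| {
        let rest = input.strip_prefix(*word)?;
        // Require a word boundary so `plush` is not read as `plus`.
        let at_boundary = rest.chars().next().map_or(true, |c| !c.is_alphanumeric());
        at_boundary.then_some((*op, word.len()))
    })
}
```

Here `match_word_operator("divided by 4")` returns `Some((Operator::Divide, 10))`, so the caller can advance ten bytes and emit one token.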

**Given** an input line containing a variable assignment such as `x = 10`
**When** the lexer tokenizes the input
**Then** it produces an `Identifier` token, an `Assign` token, and a `Number` token

**Given** an input line containing a comment such as `// this is a note`
**When** the lexer tokenizes the input
**Then** it produces a `Comment` token containing the comment text
**And** the comment token is preserved for display but excluded from evaluation

**Given** an input line containing plain text with no calculable expression
**When** the lexer tokenizes the input
**Then** it produces a `Text` token representing the entire line

**Given** an input line containing mixed content such as `$20 in euro - 5% discount`
**When** the lexer tokenizes the input
**Then** it produces tokens for the currency value (`$20`), the conversion keyword (`in`), the target currency (`euro`), the operator (`-`), the percentage (`5%`), and the keyword `discount`
**And** each token includes its byte span (start, end) within the input
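
To make the byte-span requirement concrete, the sketch below shows only the span bookkeeping, not token classification. `Span` and `split_with_spans` are hypothetical names, the half-open `[start, end)` convention is an assumption, and the sketch splits on whitespace only, so `$20` and `5%` stay as single pieces. It also returns a `Vec` for brevity; an iterator would fit the no-allocation criterion better.

```rust
/// Half-open byte range `[start, end)` into the input line (assumed convention).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Splits `input` on whitespace and records the byte span of each piece.
fn split_with_spans(input: &str) -> Vec<(&str, Span)> {
    let mut out = Vec::new();
    let mut start = None;
    // `char_indices` yields byte offsets, not character counts.
    for (i, c) in input.char_indices() {
        match (c.is_whitespace(), start) {
            (false, None) => start = Some(i),
            (true, Some(s)) => {
                out.push((&input[s..i], Span { start: s, end: i }));
                start = None;
            }
            _ => {}
        }
    }
    // Flush the final piece if the line does not end in whitespace.
    if let Some(s) = start {
        out.push((&input[s..], Span { start: s, end: input.len() }));
    }
    out
}
```

For `$20 in euro - 5% discount` this yields six pieces, with `euro` at bytes 7 to 11. For `€15 + 2` the first piece spans bytes 0 to 5 because `€` occupies three bytes in UTF-8, which is why the criterion specifies byte spans rather than character positions.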