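I looked into how ArkType works and built a simplified version

Created: 2026-06-09

Tags: #TypeScript

Motivation

While playing with ArkType, I found it puzzling that writing a type as a string, like type("string > 5"), gives you both type completion and runtime validation on the spot.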

ArkType

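Digging in, it turns out ArkType writes the same grammar twice: once at runtime and once at the type level.

Runtime: reads the string one character at a time and assembles a validator
Type level: reimplements the same reading procedure using only type-system features, and infers the type

In other words, there are two parsers for the string "string > 5", and because both follow the same rules, their results do not drift apart.

Reference: arktypeio/arktype ([email protected])

Runtime

Runtime parsing is ordinary string processing. There is a shift that advances one character at a time; here it reads from an array of characters and post-increments the index i.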

ark/util/scanner.ts#L16-L18

export class Scanner {
  shift() {
    return this.chars[this.i++] ?? "";
  }
}
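To get a feel for the excerpt, here is a minimal self-contained sketch of the same idea. MiniScanner is a hypothetical name of mine, and the constructor and loop are assumptions; only the body of shift mirrors the quoted code.

```typescript
// Hypothetical miniature scanner; not ArkType's actual class.
class MiniScanner {
  private chars: string[];
  private i = 0;

  constructor(def: string) {
    // Split the definition into an array of characters.
    this.chars = Array.from(def);
  }

  // Return the current character and advance; "" signals end of input.
  shift(): string {
    return this.chars[this.i++] ?? "";
  }
}

const scanner = new MiniScanner("a|b");
const seen: string[] = [];
// Keep shifting until the scanner reports the end with "".
for (let c = scanner.shift(); c !== ""; c = scanner.shift()) {
  seen.push(c);
}
```

Returning "" past the end means the caller can treat "end of input" as just one more case to branch on.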

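Details are omitted here, but the parse step repeatedly reads one character with shift and branches on it. > or < starts a bound check, and | is treated as a union (a type meaning "A or B").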

ark/type/parser/shift/operator/operator.ts#L18-L40

export const parseOperator = (s) => {
  const lookahead = s.scanner.shift();
  return lookahead === "" ? s.finalize("") // end of the string → finalize
    : lookahead === "|" ? s.pushRootToBranch(lookahead) // union
    : isKeyOf(lookahead, comparatorStartChars) ? parseBound(s, lookahead) // "string > 5"
    : lookahead === "%" ? parseDivisor(s) // "number % 2"
    : ...;
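};

Passing "string | number" proceeds as follows.

Read string → a part that "validates it is a string" is created
Read | → pushRootToBranch("|") moves that part into the holding area for union candidates
Read number → a part that "validates it is a number" is created
Reach the end of the string → finalize combines the candidates into a "string or number" validator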

pushRootToBranch and finalize return nothing (void) because the parser works by mutating the state s. The parse result accumulates in s rather than coming back as a return value.
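The four steps can be sketched as a tiny state-mutating parser. This is my own miniature, not ArkType's code: MiniState, parseUnion, and the fields root, branches, and result are hypothetical, and only the method names echo the excerpt.

```typescript
// Hypothetical miniature of a state-mutating parser (not ArkType's implementation).
type Check = (value: unknown) => boolean;

class MiniState {
  root: Check | undefined; // the part currently being built
  branches: Check[] = []; // finished union candidates
  result: Check | undefined; // set by finalize

  // Move the current root into the union candidates.
  pushRootToBranch(): void {
    if (this.root) this.branches.push(this.root);
    this.root = undefined;
  }

  // Fold the last root in, then combine all candidates with "or".
  finalize(): void {
    this.pushRootToBranch();
    const all = this.branches;
    this.result = (value) => all.some((check) => check(value));
  }
}

const parseUnion = (def: string): Check => {
  const s = new MiniState();
  let word = "";
  // Go one position past the end so that "" triggers finalize.
  for (let i = 0; i <= def.length; i++) {
    const ch = def[i] ?? "";
    if (ch === "|" || ch === "") {
      const keyword = word.trim();
      s.root = (value) => typeof value === keyword;
      word = "";
      if (ch === "|") s.pushRootToBranch();
      else s.finalize();
    } else {
      word += ch;
    }
  }
  return s.result!;
};

const isStringOrNumber = parseUnion("string | number");
```

Neither method returns anything; the only way to see what happened is to look at s afterwards, which is the style described above.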

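Type level

The type level does the same thing as the runtime, using only TypeScript's type-system features.

The counterpart of shift uses a template literal type to express taking one character off the front. That is the feature that lets you use the same `${}` notation as JS template strings inside types, and the type-level shift turned out to be this one line.

// "the first character, lookahead" concatenated with "the rest, unscanned"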
type shift<lookahead extends string, unscanned extends string> = `${lookahead}${unscanned}`;

ark/util/scanner.ts#L103-L106

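On its own it only describes "two strings joined together", but combined with infer it splits a string into its first character and the rest. When two infer positions sit side by side in a template literal, TypeScript matches exactly one character for the first and hands the remainder to the last.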

type Result = "string" extends Scanner.shift<infer lookahead, infer unscanned>
  ? [lookahead, unscanned] // ["s", "tring"]
  : never;

Applied to "string", it splits into lookahead = "s" and unscanned = "tring". This stands in for the runtime i++.
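Here is a self-contained version you can paste into a file to try it. Shift is a top-level copy of the one-liner (no Scanner namespace), and ShiftOnce and Chars are hypothetical helpers of mine that apply it once and repeatedly.

```typescript
// Top-level copy of the one-line shift type.
type Shift<lookahead extends string, unscanned extends string> =
  `${lookahead}${unscanned}`;

// Take one character off the front; never for the empty string.
type ShiftOnce<S extends string> =
  S extends Shift<infer lookahead, infer unscanned>
    ? [lookahead, unscanned]
    : never;

// Repeat the shift until nothing is left, collecting each character.
type Chars<S extends string> =
  S extends Shift<infer lookahead, infer unscanned>
    ? [lookahead, ...Chars<unscanned>]
    : [];

// These assignments only type-check if the inferred tuples match.
const once: ShiftOnce<"string"> = ["s", "tring"];
const chars: Chars<"a|b"> = ["a", "|", "b"];
```

The empty string fails the match because the first infer position needs one character, so recursion in Chars stops by itself, much like the runtime shift returning "".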

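The branching matches the runtime too. It takes one character, then treats | as a union and a comparator character such as > as a bound check, with the cases listed in the same order.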

type parseOperator<s, $, args> =
  s["unscanned"] extends Scanner.shift<infer lookahead, infer unscanned>
    ? lookahead extends "|" ? s.reduceBranch<s, lookahead, unscanned>
      : lookahead extends ComparatorStartChar ? parseBound<s, lookahead, unscanned, $, args>
      : lookahead extends "%" ? parseDivisor<s, unscanned>
      : ...
    : s.finalize<s, "">;

ark/type/parser/shift/operator/operator.ts#L42-L62

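The mechanisms differ (array i++ versus template literal + infer), but the two appear to correspond one-to-one.

Next, I'll build a minimal version of this myself.

What I built

Only three forms are supported: a keyword ("string"), a bound ("number > 5"), and a union ("string | number")

① Type-level parser

As in the original, the string is taken apart with template literal types.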

type Keyword = "string" | "number" | "boolean";

// Strip leading and trailing spaces one at a time
type Trim<S extends string> = S extends ` ${infer R}`
  ? Trim<R>
  : S extends `${infer L} `
    ? Trim<L>
    : S;

// Convert a keyword to its type, e.g. "string" → string
type InferBase<S extends string> = S extends "string"
  ? string
  : S extends "number"
    ? number
    : S extends "boolean"
      ? boolean
      : never;

// From a spaced expression like "number > 5", extract only the base part (number)
type InferOperand<S extends string> = S extends `${infer B} ${string}`
  ? InferBase<Trim<B>>
  : InferBase<S>;

type Infer<S extends string> = S extends `${infer L}|${infer R}`
  ? Infer<Trim<L>> | Infer<Trim<R>> // split on "|" and build the union recursively
  : InferOperand<Trim<S>>;

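The entry point is Infer, and it does two things.

If there is a |, as in "string | number", split around it and process each side recursively → the result is string | number
If there is no |, look only at the number part of "number > 5" and turn it into a type. > 5 is a constraint that only means something at runtime, so the type can drop it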
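One way to convince yourself the parser behaves as described is a handful of compile-time checks. The four parser types are repeated (compacted) so the snippet stands alone; Equal, Expect, and Cases are hypothetical helpers of mine, not part of ArkType.

```typescript
type Trim<S extends string> = S extends ` ${infer R}` ? Trim<R>
  : S extends `${infer L} ` ? Trim<L>
  : S;
type InferBase<S extends string> = S extends "string" ? string
  : S extends "number" ? number
  : S extends "boolean" ? boolean
  : never;
type InferOperand<S extends string> = S extends `${infer B} ${string}`
  ? InferBase<Trim<B>>
  : InferBase<S>;
type Infer<S extends string> = S extends `${infer L}|${infer R}`
  ? Infer<Trim<L>> | Infer<Trim<R>>
  : InferOperand<Trim<S>>;

// Hypothetical helpers: Equal is true only when A and B are the same type.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2)
    ? true
    : false;
type Expect<T extends true> = T;

// Each entry fails to compile if the inferred type is not the expected one.
type Cases = [
  Expect<Equal<Infer<"string">, string>>,
  Expect<Equal<Infer<"number > 5">, number>>,
  Expect<Equal<Infer<"string | number">, string | number>>,
  Expect<Equal<Infer<"strin">, never>>, // an unknown keyword becomes never
];

const cases: Cases = [true, true, true, true];
```

The last case shows a rough edge of this simplified version: a typo is not reported as an error, it just infers never.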

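② Runtime check

The original is a state machine that shifts one character at a time, but with only three forms to handle, split is enough. Splitting "number > 5" on whitespace gives ["number", ">", "5"], so we just inspect that.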

const checkOperand = (def: string, value: unknown): string | null => {
  const [base, op, limit] = def.trim().split(/\s+/);
  if (typeof value !== base) return `must be ${base}`;
  if (!op) return null;

  // For string the bound applies to its length; for number, to the value itself
  const size = base === "string" ? (value as string).length : (value as number);
  const n = Number(limit);
  if (op === ">" && size > n) return null;
  if (op === ">=" && size >= n) return null;
  if (op === "<" && size < n) return null;
  if (op === "<=" && size <= n) return null;
  return `must be ${def.trim()} (got ${size})`;
};

It returns null if there is no problem, or a string giving the reason if there is.
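One catch with splitting on whitespace is that the spaces become mandatory: "number>5" splits into the single token "number>5", which is then treated as the base, so every value is rejected. If that matters, a regex could replace the split. This is only a sketch under my own assumptions; Operand, operandPattern, and parseOperand are hypothetical names.

```typescript
// Hypothetical alternative to the whitespace split; tolerates "number>5".
type Operand = { base: string; op?: string; limit?: number };

// keyword, then optionally a comparator and a decimal number, spaces optional
const operandPattern =
  /^\s*([a-z]+)\s*(?:(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?))?\s*$/;

const parseOperand = (def: string): Operand | null => {
  const match = operandPattern.exec(def);
  if (!match) return null; // outside the supported grammar
  const base = match[1]!;
  const op = match[2]; // undefined when there is no bound
  const limit = match[3];
  return op === undefined ? { base } : { base, op, limit: Number(limit) };
};

const tight = parseOperand("number>5");
const spaced = parseOperand("string >= 10");
const bare = parseOperand("string");
const broken = parseOperand("number >");
```

A dangling operator like "number >" returns null here, so the caller can report a bad definition instead of silently comparing against NaN.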

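③ Tying ① and ② together

Finally, type() combines the two. The same string def flows into ②'s checkOperand as a value and into ①'s Infer<S> as a type. This is the original in miniature. A union is split on |, and it passes if at least one branch passes.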

type Result<T> = { value: T } | { error: string };

type Type<T> = ((data: unknown) => Result<T>) & { infer: T };

const type = <const S extends string>(def: S): Type<Infer<S>> => {
  const branches = def.split("|");
  const fn = (data: unknown): Result<Infer<S>> => {
    const problems = branches.map((b) => checkOperand(b, data));
    if (problems.some((p) => p === null)) return { value: data as Infer<S> };
    if (problems.length === 1) return { error: problems[0]! };
    return { error: `must be one of: ${problems.join(" / ")}` };
  };
  return Object.assign(fn, { infer: undefined as unknown as Infer<S> });
};

The return type Type<Infer<S>> is what makes type completion work
The checkOperand inside does the runtime validation
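On the consuming side, the infer property and the Result union are used roughly as follows. To keep the snippet standalone, Str is a hypothetical hand-written validator standing in for what type("string") would return; StrValue and report are also my own names.

```typescript
type Result<T> = { value: T } | { error: string };
type Type<T> = ((data: unknown) => Result<T>) & { infer: T };

// Hypothetical hand-written validator standing in for type("string").
const Str: Type<string> = Object.assign(
  (data: unknown): Result<string> =>
    typeof data === "string" ? { value: data } : { error: "must be string" },
  { infer: undefined as unknown as string },
);

// infer only carries the type; read it with typeof, never at runtime.
type StrValue = typeof Str.infer; // string

const report = (data: unknown): string => {
  const out = Str(data);
  // The "in" check narrows the union to one of its two shapes.
  if ("error" in out) return `rejected: ${out.error}`;
  const value: StrValue = out.value;
  return `accepted: ${value.toUpperCase()}`;
};
```

Because infer is undefined at runtime, it is only safe inside typeof in a type position.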

Running it

As with the original, passing a string returns a validation function.

const Name = type("string > 5");
const Age = type("number");
const Mix = type("string | number");

console.log(Name("Alan Turing"));
console.log(Name("Bob"));
console.log(Age("42"));
console.log(Mix(true));

Running this produces the following.

{ value: "Alan Turing" }
{ error: "must be string > 5 (got 3)" }
{ error: "must be number" }
{ error: "must be one of: must be string / must be number" }

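What I learned

The same grammar is written separately at the type level and at runtime, and the two are kept from drifting apart.
The original apparently guarantees that consistency with attest, which "pulls TypeScript types out at runtime and tests them"
- I'd like to look at this next
Template literal types and taking strings apart with infer were good things to learn.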

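Afterword

If you read the original, this order seems good: (1) the runtime parse* functions → (2) the Infer-style types for the same grammar → (3) attest, which guarantees they agree
Next, adding array support for "number[]" seems like a good way to get a feel for type-level recursion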
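As a starting point for the array idea, the type-level half might look something like this. It is an untested-against-ArkType sketch of my own: InferArray is a hypothetical name, InferBase is repeated from ① so the snippet stands alone, and the runtime half is left out.

```typescript
type InferBase<S extends string> = S extends "string" ? string
  : S extends "number" ? number
  : S extends "boolean" ? boolean
  : never;

// Peel one trailing "[]" per recursion step, then fall back to the keyword.
type InferArray<S extends string> = S extends `${infer Element}[]`
  ? InferArray<Element>[]
  : InferBase<S>;

// These assignments only type-check if the inferred array types match.
const flat: InferArray<"number[]"> = [1, 2, 3];
const nested: InferArray<"string[][]"> = [["a"], ["b", "c"]];
```

Each step removes one "[]" from the end, so "string[][]" recurses twice before reaching the keyword.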