ESTree

大家应该都已经知道了Babel处理JS代码的方式，就是解析 => 转换 => 生成三个步骤，然后解析里面又包括了词法分析和语法分析，和这个过程息息相关的就是AST（抽象语法树），Babel的核心工具库都是和AST相关的。在讲Babel的核心库之前有必要介绍一下Babel AST的标准，babel的AST标准是以ESTree作为基准的，只不过部分属性有些偏差，官方文档是这么说的：

The Babel parser generates AST according to Babel AST format. It is based on ESTree spec with the following deviations:

Literal token is replaced with StringLiteral, NumericLiteral, BigIntLiteral, BooleanLiteral, NullLiteral, RegExpLiteral

Property token is replaced with ObjectProperty and ObjectMethod

MethodDefinition is replaced with ClassMethod

Program and BlockStatement contain additional directives field with Directive and DirectiveLiteral

ClassMethod, ObjectProperty, and ObjectMethod value property's properties in FunctionExpression is coerced/brought into the main method node.

ChainExpression is replaced with OptionalMemberExpression and OptionalCallExpression

ImportExpression is replaced with a CallExpression whose callee is an Import node.

ESTree主要提供了Identifier、Literal、Statements、Declarations、Expressions 几种类型的节点，每种节点可能会细分成更多的子节点，下面来看一下这些节点的结构：

Node

Node结构可以理解为是所有节点的基类，Node节点中的属性在其他AST节点中都存在。Node节点的结构如下：

interface Node {
    type: string;
    loc: SourceLocation | null;
}
interface SourceLocation {
    source: string | null;
    start: Position;
    end: Position;
}
interface Position {
    line: number; // >= 1
    column: number; // >= 0
}

Identifier

标识，函数名、变量名、方法名都属于Identifier，Identifier基本结构如下（不包含Node的基本节点）：

interface Identifier {
    type: "Identifier";
    name: string;
}

Literal

文字类型，变量的值（字符串、数字、bool值、null等）、正则表达式，在Babel中对应的类型就是stringLiteral、numericLiteral等等，Literal基本结构如下：

interface Literal {
    type: "Literal";
    value: string | boolean | null | number | RegExp;
}

Statements

Statement就是我们代码的语句，Statement有非常多的子类型，比如ExpressionStatement、BlockStatement、ReturnStatement、IfStatement、WhileStatement等等，以ExpressionStatement举例，其结构如下：

interface ExpressionStatement <: Statement {
    type: "ExpressionStatement";
    expression: Expression;
}

Declarations

Declaration代表声明、定义，和Statement一样在AST中出现的是他的子类型，主要有：FunctionDeclaration、VariableDeclaration：

interface FunctionDeclaration {
    type: "FunctionDeclaration";
    id: Identifier;
}
interface VariableDeclaration {
    type: "VariableDeclaration";
    declarations: [ VariableDeclarator ];
    kind: "var";
}

Expressions

Expressions就是表达式，主要有FunctionExpression、ObjectExpression、ArrayExpression、ThisExpression等等，以BinaryExpression为例：

interface BinaryExpression {
    type: "BinaryExpression";
    operator: BinaryOperator;
    left: Expression;
    right: Expression;
}

Babel核心工具库

上面介绍了Babel AST的基本结构，接下来就可以进入主题来讲讲Babel提供给我们的核心的工具库。 Babel主要提供了以下几个核心工具库：

@babel/parser
@babel/core
@babel/types
@babel/traverse
@babel/generator
@babel/template

@babel/parser

parser 是 Babel 的JS解析器，可以将JS代码转换成符合 estree 规范的AST结构。 parser 功能强大，本身就支持最新的ES规范、JSX、Flow、TS，也支持在提案中的JS语法。 @babel/parser 主要提供两个API：parse 和 parseExpression，两个API功能类似，区别就在于parse会把代码段当做一整个程序代码，而 parseExpression 则代码段当做一个表达式来解析，如果你想使用 parseExpression 来解析一个Declaration ，那会将得到一个 SyntaxError: Unexpected token 的错误。下面以一个简单的例子介绍下parser的使用，并对比两个api的差异：首先写一个我们准备parse的代码段：

const parser = require("@babel/parser")
const code = 'functin foo(a, b) {return a + b}'
console.log(parser.parse(code))
console.log(parser.parseExpression(code))

parse API得到的结果是：

{
  "type": "File",
  // ...
  "program": {
    "type": "Program",
    // ...
    "sourceType": "script",
    "interpreter": null,
    "body": [
      {
        "type": "FunctionDeclaration",
        "id": {
          "type": "Identifier",
           "identifierName": "foo"
         },
         "name": "foo"
        },
        "generator": false,
        "async": false,
        "params": [
         // ...
        ],
        "body": {
         // ...
        }
      }
    ],
    "directives": []
  },
  "comments": []
}

可以看到，使用type为File的节点开始的。再来看看 parseExpression：

{
  "type": "FunctionExpression",
  "id": {
    "type": "Identifier",
    "name": "foo"
  },
  "generator": false,
  "async": false,
  "params": [
    // ...  
  ],
  "body": {
    "type": "BlockStatement",
    "body": [
      {
        "type": "ReturnStatement",
        // ...
      }
    ],
    "directives": []
  }
}

如果我的代码里有jsx或ts语法parser不能直接解析怎么办呢？如下例：

const reactCode = '<div><p>Hello Babel</p></div>'
const reactCodeRes = parser.parseExpression(reactCode);

直接解析jsx语法将会报下面的错误：

SyntaxError: This experimental syntax requires enabling one of the following parser plugin(s): 'jsx, flow, typescript' (1:0)

告诉我们需要添加plugins，parser是如何添加plugins的呢？和之前babel基础使用中讲到的是一样的plugins吗？这里的答案是否定的，parser的plugins比较特别，上面也提到了，parser功能非常强大支持了各种语法，所以在parse或parseExpression方法中传入第二个参数，第二个参数是一个对象，这个对象有一个plugins属性，plugins属性是一个可枚举的数组, plugins支持很多值，比如bigInt、decorators等等。这里我们要兼容jsx语法，所以plugins中补充了jsx这个plugin，然后再次进行parse可以看到成功解析了

{
  "type": "JSXElement",
  "openingElement": {
    "type": "JSXOpeningElement",
    // ...
  },
  // ...
  "children": [
    {
      "type": "JSXElement",
      // ...
    }
  ]
}

@babel/traverse

得到了AST之后，我们就要对AST中的节点进行特定的处理。在处理对应节点之前我们必须拿到对应的节点，有了AST一层一层的遍历下去获取到想要的节点是可以的，但是这样手写总归过于繁琐而且容易出bug，babel提供了 traverse 这个库来帮助我们更好的遍历整个AST。 @Babel/traverse 会以深度优先遍历的形式遍历整棵AST树，通过一个 visitor 对象，遍历到对应类型的节点，从 visitor 中找到对应的处理方法来进行处理，举个简单的例子：首先解析得到一棵非常简单的代码的AST树，

const parser = require("@babel/parser")
const traverse = require('@babel/traverse').default

const code = 'if(a === 1){} else {}'
const codeAST = parser.parse(declarationCode, {});

创建一个Visitor，用来处理节点：

const visitor = {
    BlockStatement: (path) => {
        console.log(path.node)    
    },
    BinaryExpression: (path) => {
        console.log(path.node)
    }
}
traverse(declarationCodeAST, declarationVisitor)

打印的结果如下（省去了很多字段）：可以看到打印出了 BlockStatement 和 BinaryExpression 两种类型的节点。

BinaryExpression Node {
  type: 'BinaryExpression',
  extra: undefined,
  left:
   Node {
     type: 'Identifier',
     name: 'a' },
  operator: '===',
  right:
   Node {
     type: 'NumericLiteral',
     extra: { rawValue: 1, raw: '1' },
     value: 1 } }

BlockStatement Node {
  type: 'BlockStatement',
  body: [],
  directives: [] }

BlockStatement Node {
  type: 'BlockStatement',
  body: [],
  directives: [] 
}

visitor内的字段还可以这样定义：

const visitor = {
    BinaryExpression: {
        enter(path) {
            console.log('BinaryExpression enter')
        },
        exit(path) {
            console.log('BinaryExpression exit')
        }
    }
}

这样当遍历到 BinaryExpression 节点的时候回调用enter方法，当处理完该节点，准备退出当前节点处理下一个节点的时候调用exit方法。这个时候长的比较帅的读者可能会问了，visitor 中每个节点的处理方法中都有一个 path 参数，这个 path 参数具体是个什么呢？path 这个参数一听名字就知道它表示一条路径，具体就是表示两个节点间的路径，这个路径信息包含了该节点和父节点的信息，也包含了一些处理节点的方法，比如增、删一个节点，path 属性有：

parent：父节点
key：当前节点在父节点对应的key
node：当前节点的具体内容
scope：作用域
context：上下文信息
...

在使用traverse的时候一般都是处理node节点，就像上面的例子一样，我们打印了path.node，打印出来的结果就是当前这个AST节点的值。

最佳实践

关于traverse有几条最佳实践可以分享一下：

traverse过程是比较消耗性能的，所以我们要尽可能的把我们的visitor合并起来减少traverse次数

path.traverse({
  Identifier(path) {
    // ...
  }});
path.traverse({
  BinaryExpression(path) {
    // ...
  }
});

合并为一次traverse访问：

path.traverse({
  Identifier(path) {
    // ...
  },
  BinaryExpression(path) {
    // ...
  }
});

优化嵌套的visitor，避免重复创建子visitor，例：

const MyVisitor = {
  FunctionDeclaration(path) {
    path.traverse({
      Identifier(path) {
        // ...
      }
    });
  }};

上面这个例子，每次当我们遍历到FunctionDeclaration节点，都会创建一个新的包含Identifier的visitor，这里就造成了不必要的资源浪费，完全可以把子visitor抽离出去成为一个常量：

const visitorOne = {
  Identifier(path) {
    // ...
  }
};
const MyVisitor = {
  FunctionDeclaration(path) {
    path.traverse(visitorOne);
  }
};

这样抽离出去之后可能会有一种case：我们的子visitor如果需要依赖父节点的一些信息进行判断处理，我们将子visitor抽离出去之后怎么将这些信息传递给子visitor呢？例子如下：

const MyVisitor = {
  FunctionDeclaration(path) {
    var exampleState = path.node.params[0].name;

    path.traverse({
      Identifier(path) {
        if (path.node.name === exampleState) {
          // ...
        }
      }
    });
  }
};

traverse提供了第二个参数帮助我们解决这个问题，第二个参数是一个对象，可以将需要的状态值传入，在子visitor中可以在this上访问到这些值：

const visitorOne = {
  Identifier(path) {
    if (path.node.name === this.exampleState) {
      // ...
    }
  }  
};
const MyVisitor = {
  FunctionDeclaration(path) {
    var exampleState = path.node.params[0].name;
    path.traverse(visitorOne, { exampleState });
  }
};

@babel/types

Babel types就厉害了，它可以帮助我们构建AST的节点，或者用来判断一个节点是不是我们要的类型的节点，举个例子：我们想要生成一个下面这个import react代码的AST结构

import React from 'react'

使用 @babel/types 可以这么写：

const t = require('@babel/types');
t.importDeclaration([t.importDefaultSpecifier(t.identifier('React'))], t.stringLiteral('react'))

得到的结果如下：

{
    "type":"ImportDeclaration",
    "specifiers":[{
        "type":"ImportDefaultSpecifier",
        "local":{
            "type":"Identifier",
            "name":"React"
        }
    }],
    "source":{
        "type":"StringLiteral",
        "value":"react"
    }
}

可以看到得到的AST节点的结构，少了location相关的属性，其他的属性都是比较完全的了当我们有一个AST树的时候想判断一个节点是不是import节点时，可以调用：

t.isImportDeclaration(node, opts)

会返回一个 boolean 值告诉我们这个节点是不是一个ImportDeclaration 具体有哪些类型可以查看官方文档

有了 @babel/types 我们可以生成我们想要的代码的AST，通过下面即将介绍的 generator来生成代码。

@babel/generator

generator从名字我们就可以猜出来是用来生成代码的。没错，@babel/generator 可以通过传入的AST生成代码，就拿我们上面 @babel/types 生成的import react的AST来实验：

const t = require('@babel/types')
const generator = require('@babel/generator').default
const importReactAST= t.importDeclaration([t.importDefaultSpecifier(t.identifier('React'))], t.stringLiteral('react'))

console.log(generator(importReactAST))

最终生成的结果如下：

{ 
    code: 'import React from "react";',
    map: null,
    rawMappings: undefined
}

里面的code就是我们想要的真正的代码。 generator函数当然不只有这一个参数，第二个参数是可选的，接收一个对象，可以配置关于输出内容的格式相关和sourcemap相关的属性，如果有将多个来源的代码生成的AST重新构建JS文件并希望sourcemap能够正确的提供必要的信息，则可以传递第三个参数，也是一个对象，key值应该为源文件名称，value对应为源内容。例：（来自babel官网）

const parse = require('@babel/parser').parse;
const generate = require('@babel/generator').default
const a = "var a = 1;";
const b = "var b = 2;";
const astA = parse(a, { sourceFilename: "a.js" });
const astB = parse(b, { sourceFilename: "b.js" });
const ast = {
  type: "Program",
  body: [].concat(astA.program.body, astB.program.body),
};
const { code, map } = generate(
  ast,
  { sourceMaps: true },
  {
    "a.js": a,
    "b.js": b,
  }
);

总结

本文介绍了Babel提供的核心工具库，主要是@babel/parser、@babel/traverse、@babel/types和@babel/generator，通过这些库可以帮助我们非常方便的获取、操作AST树或者生成我们需要的代码。

Babel 101 - 02 - Babel进阶使用指南