SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser
This work proposes a syntax-enhanced grammatical error correction (GEC)
approach named SynGEC that effectively incorporates dependency syntactic
information into the encoder part of GEC models. The key challenge for this
idea is that off-the-shelf parsers are unreliable when processing ungrammatical
sentences. To address this challenge, we propose to build a tailored
GEC-oriented parser (GOPar) using parallel GEC training data as a pivot. First,
we design an extended syntax representation scheme that allows us to represent
both grammatical errors and syntax in a unified tree structure. Then, we obtain
parse trees of the source incorrect sentences by projecting trees of the target
correct sentences. Finally, we train GOPar with such projected trees. For GEC,
we employ a graph convolutional network (GCN) to encode the source-side
syntactic information produced by GOPar, and fuse it with the outputs of the
Transformer encoder. Experiments on mainstream English and Chinese GEC datasets
show that our proposed SynGEC approach consistently and substantially
outperforms strong baselines and achieves competitive performance. Our code and
data are all publicly available at https://github.com/HillZhang1999/SynGEC.
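As a rough illustration of the encoder-side fusion described above, the sketch below combines Transformer encoder outputs with representations propagated by a small GCN over a dependency adjacency matrix. This is a minimal sketch under assumed module names, dimensions, and a toy adjacency matrix (GCNLayer, SyntaxFusionEncoder, etc. are hypothetical), not the released SynGEC implementation; see the repository above for the actual code.

```python
# Minimal, illustrative sketch of fusing GCN-encoded syntax with a
# Transformer encoder (not the authors' implementation).
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph-convolution layer over a batched dependency adjacency matrix."""

    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (batch, seq, dim); adj: (batch, seq, seq) with self-loops.
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)  # simple degree normalisation
        return torch.relu(self.linear(adj @ h) / deg)


class SyntaxFusionEncoder(nn.Module):
    """Transformer encoder whose outputs are fused with GCN-encoded syntax."""

    def __init__(self, vocab_size, dim=256, heads=4, enc_layers=3, gcn_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, enc_layers)
        self.gcn = nn.ModuleList([GCNLayer(dim) for _ in range(gcn_layers)])
        self.fuse = nn.Linear(2 * dim, dim)  # concatenate, then project back to dim

    def forward(self, tokens, adj):
        h_txt = self.encoder(self.embed(tokens))  # contextual token states
        h_syn = h_txt
        for layer in self.gcn:                    # propagate along dependency arcs
            h_syn = layer(h_syn, adj)
        return self.fuse(torch.cat([h_txt, h_syn], dim=-1))


if __name__ == "__main__":
    torch.manual_seed(0)
    tokens = torch.randint(0, 100, (2, 6))        # toy batch of 2 six-token sentences
    adj = torch.eye(6).expand(2, 6, 6).clone()    # self-loops only, for illustration
    adj[:, 1, 3] = adj[:, 3, 1] = 1.0             # pretend token 3 governs token 1
    out = SyntaxFusionEncoder(vocab_size=100)(tokens, adj)
    print(out.shape)                              # torch.Size([2, 6, 256])
```

In practice the adjacency matrix would be built from the dependency arcs (and error-type labels) predicted by GOPar rather than hand-set as above; the concatenation-plus-projection fusion is one simple choice among several possible gating or summation schemes.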