
2.2 Checking Conventional Expressions with Corpora

What you will learn on this page

  • What an academic corpus is and how to use it
  • The basics of move analysis
  • The CARS model for Introductions (three moves)
  • The five-move structure commonly used in Abstracts
  • Common expression patterns in Methods, Results, and Discussion
  • The role of hedging (language that avoids overstatement)
  • Systematic choices of reporting verbs
  • How to investigate conventional expressions in your own field
  • A mini phrase list by section

What is an academic corpus?

An academic corpus is a large database of academic texts (for example, research papers). By checking what expressions are actually used in your field, you can develop a data-based sense of what “natural academic English” looks like.

Tool | Description | URL
CorpusMate | A corpus focused on academic English | https://corpusmate.com/
COCA (Corpus of Contemporary American English) | A general corpus with 1B+ words, including an Academic section | https://www.english-corpora.org/coca/
MICUSP (Michigan Corpus of Upper-level Student Papers) | Highly rated student papers at undergraduate and graduate levels | https://elicorpora.info/
MICASE (Michigan Corpus of Academic Spoken English) | A corpus of academic spoken English | https://quod.lib.umich.edu/m/micase/
Elsevier OA CC-BY Corpus | Open access research paper corpus from Elsevier (download required) | https://elsevier.digitalcommonsdata.com/datasets/zm33cdndxs/3
Europe PMC | Mainly life sciences. Searchable for free, sometimes with “Free full text.” | https://europepmc.org/
PubMed Central (PMC) | Mainly biomedicine. Many full-text papers available. | https://pmc.ncbi.nlm.nih.gov/
arXiv | Free preprints across major fields | https://arxiv.org/search/
Semantic Scholar | Free paper search with links to full text when available | https://www.semanticscholar.org/
CORE | Aggregated open access papers with search | https://core.ac.uk/
OpenAlex (Works) | A large open catalog for searching research outputs | https://openalex.org/works
The Lens (Scholarly Works) | Search and analyze scholarly records (some free access) | https://www.lens.org/
BASE (Bielefeld Academic Search Engine) | Cross-search of academic web resources, often open access | https://www.base-search.net/
bioRxiv | Biology preprints, searchable on the site | https://www.biorxiv.org/search
medRxiv | Medical preprints, searchable on the site | https://www.medrxiv.org/search
ACL Anthology | Archive for computational linguistics and NLP papers | https://aclanthology.org/

Use Google Scholar as a quick-and-dirty corpus

Even without a dedicated corpus tool, you can check rough frequency using Google Scholar.

How: Enclose the phrase in double quotation marks (for example, "little is known about") so that only exact matches are returned.

The more hits an expression has, the more widely it tends to be used in academic papers. Field bias exists, so treat this as a rough guide.
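
If you compare several candidate phrases, it can help to generate the quoted search links in one go. The sketch below only builds URLs to open in a browser; it does not fetch or count hits, because Google Scholar offers no official API. The URL pattern and the function name scholar_phrase_url are assumptions made for illustration.

```python
from urllib.parse import quote_plus


def scholar_phrase_url(phrase: str) -> str:
    """Build a Google Scholar search URL for an exact (quoted) phrase."""
    quoted = f'"{phrase.strip()}"'  # the quotation marks force an exact-phrase search
    # Assumption: this URL pattern mirrors what the search box produces.
    return "https://scholar.google.com/scholar?q=" + quote_plus(quoted)


# Two competing wordings to compare by hand.
candidates = ["little is known about", "not much is known about"]
for phrase in candidates:
    print(scholar_phrase_url(phrase))
```

Open each link, note the approximate number of results, and compare the counts yourself.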

Basics of move analysis

A move is a rhetorical function that a sentence or paragraph fulfills within a section of a research paper.

A rhetorical function refers to what a sentence or paragraph is doing communicatively, such as “pointing out a gap in prior research,” “stating the study purpose,” or “arguing the significance of findings.” Move analysis focuses not on the content itself, but on what the text is doing for the reader.

Each move is often associated with conventional expressions. For example, purpose statements frequently use patterns like “The aim of this study is to …”.

Why moves and conventional expressions matter

Research papers are a genre, and each genre has conventions shared by a discourse community. Using appropriate move structures and conventional expressions is not only a matter of “English ability” but also a way to signal membership in that expert community. Even within the genre of research articles, typical moves and frequent phrases differ by field (Hyland, 2008). Understanding the conventions in your field is essential.

Using such conventional expressions is not plagiarism. It is not reasonable to claim that common phrases like “studies have shown” are protected by copyright and attributable to a specific author (Ferris, 2011). Rather, using reader-expected phrases can help your writing communicate efficiently (Conrad, 2008).

Swales (1990) proposed the CARS model (Create a Research Space), which is widely used in move analysis.

[Figure: The CARS model]

About “steps” inside moves

Each move can be divided into smaller rhetorical units called steps. Which steps are included (and which are not) varies by field.

The CARS model for Introductions (three moves)

Move 1: Establishing a territory

  • Claiming the importance of the research area
  • Providing an overview of prior research

Common expression patterns:

  • "X has been widely studied..."
  • "Recent research has shown that..."
  • "There is growing interest in..."

Move 2: Establishing a niche

  • Identifying problems or gaps in prior research

Common expression patterns:

  • "However, few studies have examined..."
  • "Little is known about..."
  • "Previous research has not adequately addressed..."

Move 3: Occupying the niche

  • Stating the purpose of the present study

Common expression patterns:

  • "The present study aims to..."
  • "This paper investigates..."
  • "The purpose of this study is to..."
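
The link between moves and conventional expressions can be illustrated with a very rough tagger. The sketch below labels each sentence of an Introduction by matching only the example patterns listed above. The names CARS_CUES and tag_moves are hypothetical, and real move analysis requires human judgment, so treat the output as a first pass at most.

```python
import re

# Cue patterns copied from the example expressions listed above.
CARS_CUES = {
    "Move 1: Establishing a territory": [
        r"\bhas been widely studied\b",
        r"\brecent research has shown that\b",
        r"\bthere is growing interest in\b",
    ],
    "Move 2: Establishing a niche": [
        r"\bfew studies have examined\b",
        r"\blittle is known about\b",
        r"\bprevious research has not adequately addressed\b",
    ],
    "Move 3: Occupying the niche": [
        r"\bthe present study aims to\b",
        r"\bthis paper investigates\b",
        r"\bthe purpose of this study is to\b",
    ],
}


def tag_moves(text):
    """Return (sentence, move label or None) pairs using naive cue matching."""
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tagged = []
    for sentence in sentences:
        label = None
        for move, patterns in CARS_CUES.items():
            if any(re.search(p, sentence, re.IGNORECASE) for p in patterns):
                label = move  # keep the first move whose cue matches
                break
        tagged.append((sentence, label))
    return tagged


sample = (
    "Vocabulary learning has been widely studied. "
    "However, few studies have examined long-term retention. "
    "The present study aims to address this gap."
)
for sentence, label in tag_moves(sample):
    print(f"{label or 'unlabeled'} -> {sentence}")
```

Sentences without a listed cue stay unlabeled, which is a useful reminder that a short pattern list covers only part of what writers actually do.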

The five-move structure of Abstracts

Abstracts often follow a conventional move structure, commonly described as five moves.

Move | Function | Typical expressions
Background | Provide context | X has become increasingly important... / Recent advances in...
Purpose | State the purpose or question | The present study aims to... / This paper investigates...
Method | Summarize methods | We utilized... / Data were collected from... / N participants completed...
Results | Report key findings | The results showed that... / The analysis revealed...
Conclusion | State conclusions or implications | These findings suggest that... / The paper concludes by...

The five moves often appear in this order, but some moves may be omitted or reordered depending on the field and journal.

Below is an example move analysis of an applied linguistics Abstract from Mizumoto & Eguchi (2023).

[Figure: Move structure of an Abstract]

Self-checking your Abstract

After writing your Abstract, check whether all five moves are included. Background and Conclusion are especially easy to omit. Confirm the conventions in your field.

AI prompt example: move analysis of an Abstract

Please analyze the following Abstract and identify which sentence(s) correspond to each move:
Background, Purpose, Method, Results, Conclusion.
If any move is missing, point it out and suggest what content should be added.

[Paste the Abstract here]

Move patterns in Methods, Results, and Discussion

The CARS model focuses on the Introduction, but other sections also show typical move patterns.

Moves in Methods

Move | Function | Typical expressions
Describing participants | Who was studied | A total of N participants were recruited...
Describing materials/instruments | What was used | The instrument consisted of... / A questionnaire was developed...
Describing procedures | What was done | Data were collected over a period of... / Each session lasted approximately...
Describing analysis | How data were analyzed | The data were analyzed using... / A two-way ANOVA was conducted...

Moves in Results

Move | Function | Typical expressions
Preparatory information | Preconditions and checks | Before conducting the main analysis, ... / Preliminary checks indicated...
Reporting results | Main findings | The analysis revealed that... / A significant difference was found...
Referring to tables/figures | Visual presentation | As shown in Table 1, ... / Figure 2 illustrates...
Brief comments | Minimal commentary | This pattern was consistent across... / Notably, ...

Results vs. Discussion

  • In Results, report findings objectively. Deeper interpretation belongs in Discussion. Brief comments (for example, “This result was unexpected...”) can be acceptable.
  • Overusing “This suggests that...” in Results blurs the boundary with Discussion.
  • Some papers combine the two sections as “Results and Discussion.”

Moves in Discussion

Move | Function | Typical expressions
Summarizing findings | Restating key results | The findings of the present study indicate that...
Interpreting findings | Explaining meaning | A possible explanation for this result is that...
Comparing with prior work | Positioning in literature | This finding is consistent with... / In contrast to Smith (2020), ...
Stating implications | Significance | These results have implications for...
Stating limitations | Constraints | A limitation of this study is that... / The sample was limited to...
Future research | Next steps | Future research should examine... / Further investigation is warranted...

AI prompt example: a presence check of Discussion moves

Please analyze the following Discussion section and check whether it includes these six moves:
(1) summary of findings, (2) interpretation, (3) comparison with prior research,
(4) implications, (5) limitations, and (6) future research.

If any move is missing or the order seems unnatural, point it out.
Do not add new content.

[Paste the Discussion here]

For a practical prompt that evaluates roles, order, and balance at the paragraph level, see:
3.4 Writing Results and Discussion

Hedging (avoiding overstatement)

In academic writing, hedging is important for calibrating the strength of claims. It is especially frequent in Discussion sections in the humanities and social sciences, where interpretations must be presented carefully. Hedging also functions as a rhetorical resource that invites reader agreement by leaving room for evaluation.
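
To see how heavily a draft is hedged, you can count hedging words per 1,000 words and compare the rate with papers from your field. The sketch below is a rough illustration: the HEDGES list is a small set chosen for this example (an assumption, not an established inventory), and the function name count_hedges is hypothetical.

```python
import re
from collections import Counter

# Illustrative hedge list (an assumption for this sketch, not an exhaustive inventory).
HEDGES = {
    "may", "might", "could", "appear", "appears", "seem", "seems",
    "suggest", "suggests", "possibly", "perhaps", "likely",
}


def count_hedges(text):
    """Return (Counter of hedges found, hedges per 1,000 words)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in HEDGES)
    per_1000 = 1000 * sum(counts.values()) / len(tokens) if tokens else 0.0
    return counts, per_1000


draft = (
    "These findings suggest that feedback may improve accuracy. "
    "The effect is possibly larger for beginners."
)
counts, rate = count_hedges(draft)
print(counts.most_common())
print(f"{rate:.1f} hedges per 1,000 words")
```

Word matching is crude (for example, "may" would also match the month), so check the flagged words in context before drawing conclusions.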

Typical hedging expressions

Use the Academic Phrasebank

If you want more conventional expressions by move and section, the University of Manchester’s Academic Phrasebank is helpful.

Choosing reporting verbs

The verb you choose when referring to prior work signals your stance.

A stance-based classification

Stance | Example verbs | Typical meaning
Neutral reporting | reported, found, observed, noted, described | Reporting without strong evaluation
Strong claims | argued, claimed, maintained, asserted, contended | The cited author takes a strong position
Suggesting/proposing | suggested, indicated, implied, proposed, recommended | A cautious or tentative stance
Demonstrating/confirming | demonstrated, confirmed, established, showed, proved | Strong support based on evidence
Critiquing/challenging | questioned, challenged, criticized, disputed, rejected | Disagreement with prior work

Common problems for Japanese users of academic English

  • Overusing “said”: said is rare in academic writing. Use verbs such as reported, noted, and argued.
  • Misjudging strength: proved and demonstrated are very strong. If the evidence is not that definitive, prefer suggested or indicated.
  • Repeating a single verb: Repeating found can sound monotonous. Vary verbs according to stance.
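
A quick way to spot repeated or overly strong verbs is to count the verbs from the classification above in your own draft. The sketch below matches only the past-tense forms listed in the table and flags any verb used three times or more. The names REPORTING_VERBS and reporting_verb_report are hypothetical, and the threshold of three is an assumption you can change.

```python
import re
from collections import Counter

# Verbs and stance labels taken from the stance-based classification above.
REPORTING_VERBS = {
    "neutral": ["reported", "found", "observed", "noted", "described"],
    "strong claim": ["argued", "claimed", "maintained", "asserted", "contended"],
    "suggesting": ["suggested", "indicated", "implied", "proposed", "recommended"],
    "demonstrating": ["demonstrated", "confirmed", "established", "showed", "proved"],
    "critiquing": ["questioned", "challenged", "criticized", "disputed", "rejected"],
}
VERB_TO_STANCE = {v: s for s, verbs in REPORTING_VERBS.items() for v in verbs}


def reporting_verb_report(text, repeat_threshold=3):
    """Return (Counter of reporting verbs, sorted list of verbs at or above the threshold)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in VERB_TO_STANCE)
    repeated = sorted(v for v, n in counts.items() if n >= repeat_threshold)
    return counts, repeated


draft = (
    "Smith (2020) found that X. Lee (2021) found that Y. "
    "Kim (2019) found no effect, whereas Sato (2022) argued that Z."
)
counts, repeated = reporting_verb_report(draft)
for verb, n in counts.most_common():
    print(f"{verb:<12} {n}  ({VERB_TO_STANCE[verb]})")
print("Used three times or more:", repeated)
```

Because the match is on word forms only, passive uses such as "was found" are counted too, so read the flagged sentences before revising.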

AI prompt example: checking reporting verbs

Please extract all reporting verbs used in the following Introduction (or Literature Review)
and list them in a table.

Then check:
(1) Is the same verb used three times or more?
(2) Are very strong verbs (e.g., proved, demonstrated) used in contexts that should be more neutral?
(3) Are there any verbs that do not fit the context?

If there are issues, point them out and suggest one alternative verb for each case.

[Paste the text here]

Hands-on: investigating expressions in your field

Method 1: Build a small corpus yourself

  • Prepare 3–10 relevant papers in your field as PDFs (more is better).
  • Download AntConc. On macOS, CasualConc is also recommended.
  • After installing AntConc, load PDFs using the steps below.
    • “File” > “Open Corpus Manager” > Corpus Source: Raw File(s)
      > select PDFs > click [Create] > click [Return to Main Window]
    • For a single file: “File” > “Open File(s) as ‘Quick Corpus’”
  • Enter phrases from this page and check how they are used.

    [Figure: KWIC example in AntConc]

  • For AntConc usage, see the official tutorial playlist:
    AntConc tutorials

  • If your target papers are in PLOS ONE, you can use AntCorGen to build a corpus automatically.
  • Recent versions of AntConc include a ChatAI feature that connects to various LLMs, so you can also ask questions about the phrases you searched for (API access required).

    [Figure: AntConc ChatAI]
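
If you want to see what a KWIC (keyword in context) search does under the hood, the idea can be sketched in a few lines. This is a minimal illustration, not how AntConc is implemented. It assumes the paper text has already been extracted to a plain string (PDF extraction is not shown), and the function name kwic is hypothetical.

```python
import re


def kwic(text, phrase, width=30):
    """Return keyword-in-context lines for every match of `phrase`."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    lines = []
    for m in re.finditer(re.escape(phrase), flat, flags=re.IGNORECASE):
        left = flat[max(0, m.start() - width):m.start()].rstrip()
        right = flat[m.end():m.end() + width].lstrip()
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


sample = """Little is known about how learners use corpora.
Although feedback is common, little is known about
its long-term effects."""

for line in kwic(sample, "little is known about"):
    print(line)
```

Reading the words to the right of the keyword is often the fastest way to see which nouns and clause types a phrase attracts in your field.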

Method 2: Combine generative AI with corpora

It can be efficient to ask generative AI for expression patterns by move and then validate them with a corpus tool (AntConc).

AI prompt example: proposing common expressions

Please list 10 English expression patterns that are commonly used in empirical applied linguistics papers
for Move 2 (identifying a gap) in the Introduction.

Note: Do not quote or invent papers. Provide general expression patterns only.

You can also examine collocations (which words co-occur with a target word).

AI prompt example: checking collocations

Please list 10 nouns that commonly collocate with "significant" in academic English
(e.g., significant difference, significant effect).
For each, indicate a rough frequency level (high/medium/low) and which section it is mainly used in.

Key points for validation

  • For AI-suggested collocations, confirm with a corpus and also with Google Scholar phrase searches (for example, "significant improvement").
  • If you are unsure whether an AI-suggested phrase is actually used in research papers, search your corpus or Google Scholar and confirm.
  • For field-specific content and expressions, a small corpus you build yourself can be more reliable than generative AI.
  • To write in a way that signals membership in your discourse community, check expressions in real papers from your field.
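
One way to confirm AI-suggested collocations against your own corpus is to count which words actually follow the target word. The sketch below counts the word immediately to the right of a node word in a plain-text string. The function name right_collocates is hypothetical, and because punctuation is ignored, a pair can occasionally span a sentence boundary.

```python
import re
from collections import Counter


def right_collocates(text, node, top_n=10):
    """Return the most frequent words that immediately follow `node`."""
    # Lowercase word tokens; hyphenated words are kept as one token.
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    following = Counter(
        tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == node
    )
    return following.most_common(top_n)


sample = (
    "A significant difference was found. There was a significant effect of time. "
    "No significant difference emerged, but a significant improvement was observed."
)
for word, n in right_collocates(sample, "significant"):
    print(f"significant {word}: {n}")
```

With a real corpus of papers from your field, the resulting counts give you a concrete basis for accepting or rejecting a collocation the AI proposed.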

A mini phrase list by section

Below are some high-frequency phrases by section. Check whether they are common in your field using your corpus or Google Scholar.

Introduction

Function | Example phrases
Claiming importance | X plays a crucial role in... / X has attracted considerable attention...
Summarizing prior work | A number of studies have investigated... / Previous research has focused on...
Identifying a gap | To date, no study has... / There remains a need for...
Stating purpose | This study seeks to... / The aim of this paper is to...

Methods

Function | Example phrases
Participants | A total of N participants (M age = ...) took part in...
Procedures | The experiment was conducted in... / Data collection took place over...
Analysis | Descriptive statistics were computed for... / To examine..., a t-test was performed.

Results

Function | Example phrases
Introducing results | Table 1 presents the descriptive statistics for...
Reporting significance | A statistically significant difference was observed, t(df) = ..., p < ...
Reporting effect size | The effect size was medium (Cohen's d = ...)

Discussion

Function | Example phrases
Restating findings | The most notable finding was that...
Interpreting findings | One possible explanation is that... / This may be attributed to...
Consistency with prior work | This finding aligns with... / This is in line with the results of...
Inconsistency with prior work | This result contradicts... / Unlike Smith (2020), the present study found...
Limitations | This study has several limitations. First, ...
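
To check whether phrases like these are common in your field, it helps to look at how many different papers use each one, not just the total count. The sketch below assumes your corpus is a dictionary that maps a file name to the plain text of one paper. The names phrase_stats and corpus, and the sample texts, are made up for illustration.

```python
import re


def phrase_stats(corpus, phrases):
    """Map each phrase to (total hits, number of texts that contain it)."""
    stats = {}
    for phrase in phrases:
        # Allow any whitespace (including line breaks) between the words.
        pattern = re.compile(
            r"\s+".join(re.escape(word) for word in phrase.split()),
            re.IGNORECASE,
        )
        hits_per_text = [len(pattern.findall(text)) for text in corpus.values()]
        stats[phrase] = (sum(hits_per_text), sum(1 for h in hits_per_text if h))
    return stats


# Toy corpus: in practice, each value would be the full text of one paper.
corpus = {
    "paper_a.txt": "This finding aligns with earlier work. One possible\n"
                   "explanation is that learners relied on memorized phrases.",
    "paper_b.txt": "One possible explanation is that the task was too easy.",
    "paper_c.txt": "This study has several limitations.",
}
phrases = [
    "One possible explanation is that",
    "This finding aligns with",
    "To date, no study has",
]
for phrase, (hits, n_texts) in phrase_stats(corpus, phrases).items():
    print(f"{phrase!r}: {hits} hit(s) in {n_texts} of {len(corpus)} texts")
```

A phrase that appears in many papers is a safer choice than one that appears many times in a single paper, which may reflect one author's habit.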

AI prompt example: checking phrase naturalness

Please check the academic phrases in the following text and determine:
(1) whether any expressions are unnatural or nonstandard, and
(2) whether there are more common alternatives.

For each issue, provide one alternative.

[Paste the text here]