Patent Claim Generation by Fine-Tuning OpenAI GPT-2

Jieh Hsiang; Jieh-Sheng Lee

arXiv: 1907.02052 · v1 · pith:QGS5OX6Inew · submitted 2019-07-01 · 💻 cs.CL · cs.LG · stat.ML

Pith reviewed 2026-05-25 12:31 UTC · model grok-4.3

classification 💻 cs.CL · cs.LG · stat.ML

keywords patent claim generation · GPT-2 · fine-tuning · text generation · natural language processing · machine learning · augmented invention

The pith

Fine-tuning GPT-2 produces the first machine-generated patent claims.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that GPT-2, a pre-trained language model, can be fine-tuned to generate patent claims by training on existing claim text. It exploits the particular way patent claims are written, which carries implicit annotations from human drafters. A sympathetic reader would care because successful generation could one day allow machines to assist in inventing by drafting claims automatically. The work examines the fine-tuning process in its early stages and tests generation with different sampling methods. It also proposes a new sampling method and provides an email bot for further experiments.

Core claim

We are the first to generate patent claims by machines and the first to apply GPT-2 to patent claim generation. By fine-tuning the model on patent claims and using their unique language structure, the model produces coherent text under both conditional and unconditional sampling.

What carries the argument

A fine-tuned GPT-2 model that treats the unique language structure of patent claims as implicit human annotation from which it learns to generate claims.

If this is right

Patent claims can be generated automatically from the fine-tuned model.
The quality can be assessed through qualitative analysis of generated samples.
A new sampling approach for text generation is proposed.
An email bot enables other researchers to interact with the model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the generated claims prove legally sound at scale, patent drafting costs could decrease significantly.
The technique may extend to generating other types of legal or technical documents with similar structures.
Longer training beyond the first 100 steps might yield even more coherent and complex claims.

Load-bearing premise

That qualitative inspection of text generated after only the first 100 training steps can meaningfully assess the coherence and legal utility of the claims.

What would settle it

A panel of patent lawyers evaluating a sample of generated claims and concluding that they do not meet legal standards for novelty, support, or clarity would show the approach does not work.

Original abstract

In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.
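
The abstract distinguishes conditional from unconditional random sampling. The sketch below illustrates that distinction together with two common truncation schemes, top-k and nucleus (top-p) filtering. It is a generic illustration and not the new sampling approach the paper proposes, which is not described on this page. The helper names (`softmax`, `top_k_filter`, `top_p_filter`, `generate`) and the toy `next_logits` callback are assumptions made for the example.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable token ids and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}

def sample(dist, rng):
    """Draw one token id from a {token_id: probability} mapping."""
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]

def generate(next_logits, prefix, steps, rng, k=2):
    """Extend `prefix` one token at a time with top-k sampling.

    Conditional sampling passes a user-supplied prompt as `prefix`;
    unconditional sampling passes only a start token.
    `next_logits` is a hypothetical stand-in for a language model.
    """
    tokens = list(prefix)
    for _ in range(steps):
        probs = softmax(next_logits(tokens))
        tokens.append(sample(top_k_filter(probs, k), rng))
    return tokens
```

In this sketch the only difference between the two modes is the content of `prefix`; the truncation filter and the random draw are identical in both.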

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

Fine-tuning GPT-2 on patent claims is a straightforward domain adaptation that stops short of showing usable output.

The paper applies GPT-2 fine-tuning to patent claim text, which was a new domain for the model at the time. It notes the distinctive structure of claims and uses that for training, then looks at generated text from the first 100 steps under both conditional and unconditional sampling. It also introduces a sampling variant and sets up an email bot so others can query the model. Those pieces are practical and give future work something to build on or test against. The claim of being first in this exact task holds up based on the references given. The work is honest about its scope and focuses on qualitative observation rather than overclaiming metrics it does not have.

The soft spot is the evaluation. Stopping at step 100 and judging coherence by eye does not establish that the model produces claims with proper antecedent structure or statutory language. Patent text is precise, and early snapshots often capture only surface patterns. No perplexity on held-out data, no baseline comparisons, and no later checkpoints are reported, so the central result stays unverified.

This is the kind of paper that might interest a small group working on legal-domain language models or early GPT-2 experiments. A reader looking for evidence that machines can generate usable patent claims will not find it here. The citation pattern is normal and does not hide prior work. I would not send this to peer review in its current state. It needs quantitative results from a converged model and some form of validation before it merits referee time.
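
The letter asks for perplexity on held-out data. As a minimal sketch of what that metric involves, assuming per-token natural-log probabilities have already been obtained from a model on held-out claims, perplexity is the exponential of the negative mean log-probability. The function name and the input format are assumptions for illustration; the paper reports no such numbers.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-(mean natural-log probability per token))."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

A model that assigns every held-out token a probability of 0.25 has a perplexity of 4, so a lower value at a later checkpoint would indicate a better fit to unseen claim text.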

Referee Report

3 major / 2 minor

Summary. The paper reports on fine-tuning OpenAI's GPT-2 model on patent claim data to generate new patent claims. It claims to be the first to do so, examines the fine-tuning by looking at generated text in the first 100 steps using conditional and unconditional sampling, proposes a new sampling approach, and provides an email bot for further exploration by researchers.

Significance. Should the approach yield coherent and legally sound patent claims upon proper evaluation, this work would mark a significant step in applying large pre-trained language models to the specialized and structured domain of patent claims, potentially facilitating AI-augmented invention processes. The qualitative results and the provided exploration tool offer a starting point for the community.

major comments (3)

[Abstract] The evaluation of the generated patent claims' quality and coherence is based exclusively on qualitative observation of outputs from the first 100 fine-tuning steps (Abstract), without any quantitative metrics, held-out test set evaluation, or expert legal review, which is insufficient to substantiate the central claim of producing usable claims.
[Abstract] No comparisons to baselines or prior methods for text generation in technical domains are reported (Abstract), making it difficult to gauge the relative performance of the fine-tuned GPT-2 model.
[Abstract] The assertion of being the first to generate patent claims by machines and apply GPT-2 to this task lacks supporting discussion of related work on patent text processing or claim generation (Abstract).

minor comments (2)

[Abstract] The description of the 'unique language structure in patent claims' and how it was leveraged is not detailed enough for reproducibility.
The paper would benefit from including the dataset size, source, and preprocessing steps in the main text.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment point by point below.

Referee: [Abstract] The evaluation of the generated patent claims' quality and coherence is based exclusively on qualitative observation of outputs from the first 100 fine-tuning steps (Abstract), without any quantitative metrics, held-out test set evaluation, or expert legal review, which is insufficient to substantiate the central claim of producing usable claims.

Authors: The work is explicitly positioned as an exploratory study of the fine-tuning process on patent claims rather than a claim of producing legally usable outputs. We acknowledge the evaluation is limited to qualitative observations and will revise the abstract and introduction to more clearly state the exploratory scope, limitations, and that no claims are made regarding legal soundness or immediate usability.

revision: partial

Referee: [Abstract] No comparisons to baselines or prior methods for text generation in technical domains are reported (Abstract), making it difficult to gauge the relative performance of the fine-tuned GPT-2 model.

Authors: As the first reported application of GPT-2 (or any LM) to patent claim generation, no task-specific baselines existed at the time. We will add a short discussion of related text generation techniques in technical domains to provide context for relative performance.

revision: yes

Referee: [Abstract] The assertion of being the first to generate patent claims by machines and apply GPT-2 to this task lacks supporting discussion of related work on patent text processing or claim generation (Abstract).

Authors: We agree a discussion of prior patent text processing work would strengthen the novelty claim. We will add a related work paragraph covering relevant patent analysis and generation literature.

revision: yes

standing simulated objections not resolved

Provision of quantitative metrics, held-out test set results, or expert legal review, as these were outside the scope of the original exploratory study and cannot be added without new experiments.

Circularity Check

0 steps flagged

Empirical fine-tuning experiment contains no circular derivations or self-referential claims

The paper describes an applied ML experiment: fine-tuning GPT-2 on patent claims and qualitatively inspecting generated text at early training steps. No equations, fitted parameters, predictions, or first-principles derivations are present. The central claims (first application of GPT-2 to this task, new sampling approach) are supported by reported experimental observations rather than any reduction to inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems. The work is self-contained as an empirical demonstration without the circular patterns enumerated in the analysis criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that patent claim text contains learnable structure that standard language-model fine-tuning can capture without additional supervision or constraints.

axioms (1)

domain assumption: Patent claim language possesses a unique implicit structure that can be exploited by language-model fine-tuning.
Stated in the abstract as motivation for the approach.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 7 internal anchors

[1] M.E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep contextualized word representations, (2018). https://arxiv.org/abs/1802.05365 (accessed April 10, 2018)

[2] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving Language Understanding by Generative Pre-Training (transformer in real world), (n.d.) 1–12

[3] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Models are Unsupervised Multitask Learners, (2018)

[4] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, in: Proc. 2019 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Vol. 1 (Long Short Pap.), 2019: pp. 4171–4186. https://aclweb.org/anthology/papers/N/N19/N19-1423/

[5] A. Wang, K. Cho, BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model, (2019). http://arxiv.org/abs/1902.04094 (accessed March 1, 2019)

[6] OpenAI, GPT-2 source code, (n.d.). https://github.com/openai/gpt-2 (accessed June 2, 2019)

[7] L. Aristodemou, F. Tietze, The state-of-the-art on Intellectual Property Analytics (IPA): A literature review on artificial intelligence, machine learning and deep learning methods for analysing intellectual property (IP) data, World Pat. Inf. 55 (2018) 37–51. doi:10.1016/j.wpi.2018.07.002

[8] M. Lupu, Information retrieval, machine learning, and Natural Language Processing for intellectual property information, World Pat. Inf. 49 (2017) A1–A3. doi:10.1016/j.wpi.2017.06.002

[9] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin, Attention Is All You Need, (2017). http://arxiv.org/abs/1706.03762 (accessed December 24, 2018)

[10] S. Ruder, NLP’s ImageNet moment has arrived, (n.d.). http://ruder.io/nlp-imagenet/

[11] USPTO, USPTO Open Data Portal, (n.d.). https://developer.uspto.gov/

[12] Google, Google Patents Public Datasets on BigQuery, (n.d.). https://console.cloud.google.com/bigquery?p=patents-public-data

[13] M. Woolf, gpt-2-simple, (n.d.). https://github.com/minimaxir/gpt-2-simple

[14] gpt2-claims-2013_for_345M.npz, (n.d.). https://data.mendeley.com/datasets/b8853hnj7b/draft?a=b99308ff-c24b-428c-96d7-d851962a2714

[15] gpt2-claims-2013.txt, (n.d.). https://data.mendeley.com/datasets/9dvny7cgcz/draft?a=6ba92bff-b464-4665-90c9-8e03f1ba4a13

[16] OpenAI, Better Language Models and Their Implications, (n.d.). https://openai.com/blog/better-language-models/ (accessed June 3, 2019)

[17] Google Colaboratory, (n.d.). https://colab.research.google.com (accessed June 2, 2019)

[18] N. Shepperd, memory_saving_gradients.py, (n.d.). https://github.com/nshepperd/gpt-2/blob/finetuning/src/memory_saving_gradients.py

[19] Hugging Face, The Big-&-Extending-Repository-of-Transformers: Pretrained PyTorch models for Google’s BERT, OpenAI GPT & GPT-2, Google/CMU Transformer-XL., (n.d.). https://github.com/huggingface/pytorch-pretrained-BERT (accessed June 3, 2019)

[20] First 100 steps of fine-tuning GPT-2, (n.d.). https://data.mendeley.com/datasets/cgy6ng9kwm/draft?a=c9b6c696-f768-46de-b531-7c2b5479bc50

[21] J. Vig, Visualizing Attention in Transformer-Based Language Representation Models, (2019). http://arxiv.org/abs/1904.02679 (accessed April 26, 2019)

[22] A. Holtzman, J. Buys, M. Forbes, Y. Choi, The Curious Case of Neural Text Degeneration, (2019). http://arxiv.org/abs/1904.09751 (accessed May 21, 2019)

[23] Unconditional sampling results, (n.d.). https://data.mendeley.com/datasets/wftfn4rs4p/draft?a=009a0411-eb5c-4dcf-bc0d-4995842a38ae

[24] Conditional sampling results (1), (n.d.). https://data.mendeley.com/datasets/sp3g6c4mc5/draft?a=df172e77-ed9f-4b9d-9339-e1e8305a6d3d

[25] Conditional sampling results (2), (n.d.). https://data.mendeley.com/datasets/dnxdrgr3h6/draft?a=173df973-966c-4c1f-b3f6-1417d6aa1f4a

[26] Z. Li, X. Ding, T. Liu, Story Ending Prediction by Transferable BERT, (2019). http://arxiv.org/abs/1905.07504 (accessed June 2, 2019)

[27] T. Pires, E. Schlinger, D. Garrette, How multilingual is Multilingual BERT?, ArXiv1906.01502v1 [Cs]. (2019). http://arxiv.org/abs/1906.01502v1 (accessed June 10, 2019).

Appendix A: The following SQL selects the first claims of all US utility patents in 2013 and aggregates the CPC codes at subclass level: (data source: Google Patents Public Datasets on BigQu...
