Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

Hadrien Van Lierde; Tommy W. S. Chow

arxiv: 1906.09445 · v1 · pith:FUQGGQEQnew · submitted 2019-06-22 · 💻 cs.CL · cs.AI

Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

Hadrien Van Lierde , Tommy W. S. Chow This is my paper

Pith reviewed 2026-05-25 18:13 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords fuzzy hypergraphtopic modelextractive summarizationquery-oriented summarizationsubmodular optimizationHierarchical Dirichlet Processsemantic similaritycontent coverage

0 comments

The pith

A fuzzy hypergraph with topics as hyperedges lets a summarizer pick sentences that share meaning even when they share few words.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing graph methods for extractive summarization connect sentences only when they share words, missing cases where different wording conveys the same information. The paper infers topic distributions for each sentence using a Hierarchical Dirichlet Process model, then builds a fuzzy hypergraph whose nodes are sentences and whose hyperedges are the topics, with membership degrees reflecting how strongly each sentence belongs to each topic. A submodular objective is defined that jointly maximizes a sentence set's relevance to a user query, its centrality within this hypergraph, and its coverage of the corpus topics; the objective is maximized by a polynomial-time greedy algorithm. Experiments on standard benchmarks show the resulting summaries achieve higher content coverage than prior lexical-graph and hypergraph baselines.

Core claim

The paper claims that replacing lexical edges with fuzzy topic hyperedges allows an extractive summarizer to select a sentence set whose total semantic coverage, measured by topic overlap, exceeds what lexical-similarity graphs can achieve while still respecting query relevance and centrality constraints.

What carries the argument

fuzzy hypergraph whose nodes are sentences and whose hyperedges are topics drawn from a Hierarchical Dirichlet Process, with each sentence-topic pair carrying a continuous membership weight

If this is right

The submodular formulation guarantees a polynomial-time algorithm with a known approximation guarantee.
Summaries produced by the method exhibit higher content coverage scores than those from prior graph-based systems.
The same hypergraph can be queried for different user inputs without recomputing the topic structure.
Centrality and coverage terms can be traded off against query relevance by adjusting the objective weights.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same fuzzy-hypergraph construction could be applied to sentence clustering or passage retrieval where semantic rather than lexical grouping is desired.
If a stronger topic model replaced the Hierarchical Dirichlet Process, the hyperedges would become sharper and the coverage objective would improve without changing the rest of the pipeline.
The approach assumes a static corpus; extending it to streaming documents would require an online topic-inference step whose effect on the hypergraph optimization remains untested.

Load-bearing premise

Topic distributions inferred by the Hierarchical Dirichlet Process model accurately capture semantic similarities between lexically dissimilar sentences.

What would settle it

Run the method and a lexical baseline on a held-out corpus containing known clusters of semantically equivalent but lexically divergent sentences; the fuzzy-hypergraph summaries should include a measurably larger fraction of each cluster's sentences.

Figures

Figures reproduced from arXiv: 1906.09445 by Hadrien Van Lierde, Tommy W. S. Chow.

**Figure 1.** Figure 1: System Chart 4 Maximizing Relevance and Topical Coverage based on a sentence fuzzy hypergraph We describe each step of our MRC algorithm in details, including preprocessing, topic modelling, fuzzy hypergraph construction and sentence selection through the maximization of sentence relevance and topical coverage. 4.1 Preprocessing We apply standard preprocessing methods in text mining including stopword re… view at source ↗

**Figure 2.** Figure 2: Representation of two-level Hierarchical Dirichlet Process. [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: ROUGE-SU4 and Lexical Diversity as a function of [PITH_FULL_IMAGE:figures/full_fig_p024_3.png] view at source ↗

**Figure 4.** Figure 4: ROUGE-SU4 and Lexical Diversity as a function of [PITH_FULL_IMAGE:figures/full_fig_p025_4.png] view at source ↗

**Figure 5.** Figure 5: ROUGE-SU4 and Lexical Diversity as a function of [PITH_FULL_IMAGE:figures/full_fig_p025_5.png] view at source ↗

read the original abstract

Existing graph-based methods for extractive document summarization represent sentences of a corpus as the nodes of a graph or a hypergraph in which edges depict relationships of lexical similarity between sentences. Such approaches fail to capture semantic similarities between sentences when they express a similar information but have few words in common and are thus lexically dissimilar. To overcome this issue, we propose to extract semantic similarities based on topical representations of sentences. Inspired by the Hierarchical Dirichlet Process, we propose a probabilistic topic model in order to infer topic distributions of sentences. As each topic defines a semantic connection among a group of sentences with a certain degree of membership for each sentence, we propose a fuzzy hypergraph model in which nodes are sentences and fuzzy hyperedges are topics. To produce an informative summary, we extract a set of sentences from the corpus by simultaneously maximizing their relevance to a user-defined query, their centrality in the fuzzy hypergraph and their coverage of topics present in the corpus. We formulate a polynomial time algorithm building on the theory of submodular functions to solve the associated optimization problem. A thorough comparative analysis with other graph-based summarization systems is included in the paper. Our obtained results show the superiority of our method in terms of content coverage of the summaries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper replaces lexical graphs with fuzzy hypergraphs whose hyperedges are HDP topics, but the abstract gives no experimental details to back the superiority claim.

read the letter

The main point is a shift from lexical-similarity graphs to a fuzzy hypergraph model where HDP-derived topics serve as the hyperedges. Sentences become nodes and each topic connects a group of sentences with varying membership degrees, which is meant to link lexically dissimilar but semantically related content for query-oriented extractive summarization. The optimization then balances query relevance, graph centrality, and topic coverage under a submodular objective that admits a polynomial-time greedy algorithm. That formulation is straightforward and stays within established submodular theory without adding extra machinery. The move to topical fuzzy hyperedges is the concrete novelty relative to the lexical hypergraph baselines the abstract cites. The approach is internally consistent on its own terms and directly targets a known weakness of word-overlap methods. The soft spot is the complete absence of experimental substance in the abstract. No datasets, no baselines, no metrics beyond a vague reference to content coverage, and no diagnostic showing that the HDP sentence-level posteriors actually correlate with semantic overlap rather than noise or function words. The stress-test concern about short sentences producing unreliable topic assignments therefore stands unaddressed. If the full paper contains only the same high-level claim without ablations or human correlation checks, the advantage over simpler lexical or embedding baselines remains unproven. This is for people already working on graph-based or topic-aware extractive summarization who want to try a hypergraph variant. A reader in that narrow sub-area could extract the model and test it themselves. It is coherent enough to warrant peer review so the experiments can be examined, but the current abstract alone does not make a strong case.

Referee Report

2 major / 2 minor

Summary. The paper proposes a fuzzy hypergraph model for query-oriented extractive summarization in which sentences are nodes and topics inferred via Hierarchical Dirichlet Process (HDP) serve as fuzzy hyperedges to capture semantic (rather than purely lexical) similarities. It formulates a submodular optimization objective that jointly maximizes query relevance, hypergraph centrality, and topic coverage, and solves it in polynomial time. The abstract asserts that comparative experiments demonstrate superiority in content coverage over prior graph-based methods.

Significance. If the empirical claims hold after proper validation, the work would usefully extend lexical graph and hypergraph summarization techniques by incorporating topic-based semantic links, with the submodular formulation providing a clear algorithmic advantage. The combination of HDP with fuzzy hypergraphs for summarization is a coherent technical step, though its impact depends on whether the topic distributions actually deliver the claimed semantic connections on short sentences.

major comments (2)

[Abstract] Abstract: the central claim of superiority in content coverage is load-bearing for the contribution, yet the abstract (and the description provided) supplies no information on datasets, baselines, metrics, statistical significance, or controls; without these, the experimental support for the method cannot be evaluated.
[Method] Method section (topic model and fuzzy hypergraph construction): the claim that HDP-derived topic distributions supply meaningful fuzzy hyperedges for lexically dissimilar but semantically related sentences is the key modeling assumption, but no diagnostic (sentence-pair correlation with human judgments, embedding similarities, or ablation versus TF-IDF) is referenced; if this assumption fails on short sentences, the fuzzy hypergraph reduces to a noisier lexical graph and the superiority claim does not follow.

minor comments (2)

Notation for fuzzy membership degrees and the precise definition of hyperedge weights should be clarified with an explicit equation or example.
The polynomial-time algorithm is stated to build on submodular theory; a brief reference to the specific submodular function properties used (e.g., monotonicity, submodularity proof sketch) would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's comments. We address each major comment below and will make revisions to improve the manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim of superiority in content coverage is load-bearing for the contribution, yet the abstract (and the description provided) supplies no information on datasets, baselines, metrics, statistical significance, or controls; without these, the experimental support for the method cannot be evaluated.

Authors: We agree that the abstract is concise and would benefit from additional context on the experimental validation. The full manuscript includes a comparative analysis on standard query-oriented summarization datasets (DUC 2005/2006), against lexical graph and hypergraph baselines, using content coverage metrics such as ROUGE with reported statistical significance. We will revise the abstract to briefly reference the datasets, baselines, and metrics to better support the claim. revision: yes
Referee: [Method] Method section (topic model and fuzzy hypergraph construction): the claim that HDP-derived topic distributions supply meaningful fuzzy hyperedges for lexically dissimilar but semantically related sentences is the key modeling assumption, but no diagnostic (sentence-pair correlation with human judgments, embedding similarities, or ablation versus TF-IDF) is referenced; if this assumption fails on short sentences, the fuzzy hypergraph reduces to a noisier lexical graph and the superiority claim does not follow.

Authors: The empirical superiority over lexical baselines in the experiments provides indirect support for the semantic value of the HDP topics. However, we acknowledge that direct diagnostics would strengthen the presentation of the modeling assumption. We will add an ablation comparing the fuzzy hypergraph to a TF-IDF variant and include analysis of topic-based similarities versus embedding similarities in the revised version. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external HDP and submodular theory

full rationale

The paper's central derivation applies the standard Hierarchical Dirichlet Process (an established external model) to obtain sentence topic distributions, then defines fuzzy hyperedges as those topics and solves a submodular maximization problem using known polynomial-time algorithms from submodular function theory. Neither step reduces to a quantity defined by the authors' own prior equations or self-citations; the fuzzy hypergraph is presented as a modeling choice rather than a fitted prediction, and no load-bearing uniqueness theorem or ansatz is imported from the authors' own work. The abstract and described method remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim depends on the effectiveness of the HDP-inspired topic model for semantic grouping and the validity of the fuzzy hypergraph representation; these are introduced without independent verification in the abstract.

free parameters (1)

HDP topic model parameters
Parameters of the Hierarchical Dirichlet Process are learned from the corpus data to produce sentence topic distributions.

axioms (1)

domain assumption Each topic defines a semantic connection among a group of sentences with a certain degree of membership for each sentence
This premise directly motivates representing topics as fuzzy hyperedges in the model.

invented entities (1)

fuzzy hypergraph with topics as hyperedges no independent evidence
purpose: To represent semantic similarities between sentences for summarization optimization
Proposed as the core new structure in the paper; no external falsifiable evidence is mentioned in the abstract.

pith-pipeline@v0.9.0 · 5752 in / 1353 out tokens · 32110 ms · 2026-05-25T18:13:18.499754+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

[1]

Aiyar et al., The Refugee Surge in Europe: Economic Challenges, IMF Staﬀ Dis- cussion Note, International Monetary Fund, January 2016, p

S. Aiyar et al., The Refugee Surge in Europe: Economic Challenges, IMF Staﬀ Dis- cussion Note, International Monetary Fund, January 2016, p. 4, retrieved on August 31, 2017, from https://www.imf.org/external/pubs/ft/sdn/2016/sdn1602.pdf

work page 2016
[2]

Arora, B

R. Arora, B. Ravindran, Latent dirichlet allocation based multi-document summa- rization, In: Proc. of AND 2008 , ACM, Singapore, Singapore, 2008, pp. 91-97

work page 2008
[3]

D. S. Bershtein, A. V. Bozhenyuk, Fuzzy graphs and fuzzy hypergraphs, Encyclope- dia of Artiﬁcial Intelligence, IGI Global, Hershey, PA, 2009, pp. 704-709

work page 2009
[4]

Blake, A comparison of document, sentence, and term event spaces, In: Proc

C. Blake, A comparison of document, sentence, and term event spaces, In: Proc. of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics , ACL, 2006, pp. 601-608

work page 2006
[5]

D. M. Blei, A. Y. Ng, M. I. Jordan, Latent dirichlet allocation, Journal of machine Learning research, 3 (2003) 993-1022. 30

work page 2003
[6]

X. Cai, W. Li, Ranking through clustering: An integrated approach to multi- document summarization, IEEE Transactions on Audio, Speech, and Language Pro- cessing, 21 (7) (2013) 1424-1433

work page 2013
[7]

Cambridge online dictionary, Cambridge University Press , Cambridge, UK, 2017, retrieved at August 15, 2017

work page 2017
[8]

P. Connor, Number of Refugees to Europe Surges to Record 1.3 Million in 2015, Pew Research Center , Washington, D.C., August 2, 2016, retrieved on Au- gust 31, 2015, from http://www.pewglobal.org/2016/08/02/number-of-refugees-to- europe-surges-to-record-1-3-million-in-2015/

work page 2015
[9]

H. T. Dang, Overview or DUC 2005, In: Proc. of the document understanding con- ference, DUC 2005, Vancouver, Canada, 2005, pp. 1-12

work page 2005
[10]

H. T. Dang, Overview of the DUC 2007 summarization task, In: Proc. of the docu- ment understanding conference, DUC 2007, Rochester, NY, 2007

work page 2007
[11]

Erkan, D

G. Erkan, D. Radev, LexRank: graph-based centrality as salience in text summa- rization, Journal of Artiﬁcial Intelligence Research , 22 (2004) 457-479

work page 2004
[12]

A. Fazekas, How to See the Best Total Solar Eclipse in a Century, National Geographic , June 9, 2017, retrieved from https://news.nationalgeographic.com/2017/06/total-solar-eclipse-august-how- watch-science/

work page 2017
[13]

Garey, D

M. Garey, D. S. Johnson, Computers and intractability, vol. 29, W. H. Freeman & Co, New York, NY, 2002

work page 2002
[14]

Y. Gong, X. Liu, Generic text summarization using relevance measure and latent semantic analysis, In: Proc. of SIGIR 2001 , ACM, New Orleans, LA, 2001, pp. 19-25

work page 2001
[15]

T. Hale, Today’s Eclipse Will Actually Be Visible From The UK And Europe - Here’s How To See It, IFL Science , August 21, 2017, re- trieved from http://www.iﬂscience.com/space/dont-worry-europe-you-too-should- be-able-to-enjoy-the-eclipse/

work page 2017
[16]

Hennig, D

L. Hennig, D. A. I. Labor, Topic-based Multi-Document Summarization with Prob- abilistic Latent Semantic Analysis, In: RANLP 2009, Borovets, Bulgaria, 2009, pp. 144-149

work page 2009
[17]

T. D. Hoa, Overview or DUC 2006, In: Proc. of the document understanding con- ference, DUC 2006, New York, NY, 2006

work page 2006
[18]

Human Rights Watch, Europe’s Migration Crisis, HRW, 2017, retrieved on August 31, 2017, from https://www.hrw.org/tag/europes-migration-crisis

work page 2017
[19]

R. H. Li, J. X. Yu, Scalable diversiﬁed ranking on large graphs, IEEE Transactions on Knowledge and Data Engineering , 25(9) (2013) 2133-2146

work page 2013
[20]

Lin, E.H

C.-Y. Lin, E.H. Hovy, Automatic evaluation of summaries using n-gram co- occurrence Statistics, In: Proc. of HLT-NAACL 2003 , Edmonton, Canada, 2003, pp. 71-78. 31

work page 2003
[21]

H. Lin, J. Bilmes, Multi-document summarization via budgeted maximization of submodular functions, In: Proc. of HLT-NAACL 2010, Los Angeles, CA, 2010, pp. 912-920

work page 2010
[22]

Lloyd, Least squares quantization in PCM, IEEE transactions on information theory, 28(2) (1982) 129-137

S. Lloyd, Least squares quantization in PCM, IEEE transactions on information theory, 28(2) (1982) 129-137

work page 1982
[23]

Q. Mei, J. Guo, D. Radev, Divrank: the interplay of prestige and diversity in in- formation networks, In: Proc. of SIGKDD 2010 , ACM, Washington, DC, 2010, pp. 1009-1018

work page 2010
[24]

J. N. Mordeson, P. S. Nair, Fuzzy graphs and fuzzy hypergraphs, vol. 46, Studies in Fuzziness and Soft Computing , Springer, Berlin, Germany, 2012

work page 2012
[25]

National Aeronautics and Space Administration, How eclipses work, NASA, August 2017, retrieved from https://eclipse2017.nasa.gov/how-eclipses-work

work page 2017
[26]

Nenkova, K

A. Nenkova, K. McKeown, Automatic summarization, Foundations and Trends in Information Retrieval, 5.2-3 (2011) 103-233

work page 2011
[27]

Otterbacher, G

J. Otterbacher, G. Erkan, D. Radev, Using random walks for question-focused sen- tence retrieval, In: Proc. of HLT/EMNLP 2005 , Vancouver, Canada, 2005, pp. 915-922

work page 2005
[28]

D. G. Papademetriou, M. Sumption, W. Somerville, Migration and the Economic Downturn: What to Expect in the European Union, Migra- tion Policy Institute , Washington, D.C., January 2009, Retrieved on August 31, 2017, from https://www.migrationpolicy.org/research/migration-and-economic- downturn-what-expect-european-union

work page 2009
[29]

Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12 (2011) 2825-2830

F. Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12 (2011) 2825-2830

work page 2011
[30]

M. F. Porter, Snowball: A language for stemming algorithms, Available at: http://www.snowball.tartarus.org/texts/introduction.html, 2001

work page 2001
[31]

Portes, Immigration Is Good for Economic Growth

J. Portes, Immigration Is Good for Economic Growth. If Europe Gets It Right, Refugees Can Be Too., Huﬃngton Post , 2017, retrieved on August 31, 2017, from https://www.huﬃngtonpost.com/jonathan-portes/economic-europe- refugees b 8128288.html

work page 2017
[32]

Rokach, O

L. Rokach, O. Maimon, Clustering methods, Data mining and knowledge discovery handbook, Springer US, New York, NY, 2005, pp. 321-352

work page 2005
[33]

C. Shen, T. Li, Multi-document summarization via the minimum dominating set, In: Proc. of COLING 2010 , Beijing, China, 2010, pp. 984-992

work page 2010
[34]

Y. W. Teh, M. I. Jordan, M. J. Beal, D. M. Blei, Sharing clusters among related groups: Hierarchical Dirichlet processes, In: Advances in neural information pro- cessing systems, NIPS 2005, Vancouver, Canada, 2005, pp. 1385-1392

work page 2005
[35]

United Nations High Commissioner for Refugees, Insecurity, economic crisis, abuse and exploitation in Libya push refugees and migrants to Europe, UNHCR, July 3, 2017, retrieved on August 31, 2017, from http://www.unhcr.org/afr/news/press/2017/7/595a03bb4/insecurity-economic- crisis-abuse-exploitation-libya-push-refugees-migrants.html. 32

work page 2017
[36]

Wan, Subtopic-based multimodality ranking for topic-focused multidocument summarization, Computational Intelligence, 29(4) (2013) 627-648

X. Wan, Subtopic-based multimodality ranking for topic-focused multidocument summarization, Computational Intelligence, 29(4) (2013) 627-648

work page 2013
[37]

X. Wan, J. Yang, Multi-document summarization using cluster-based link analysis, In: Proc. of SIGIR 2008 , ACM, Singapore, Singapore, 2008, pp. 299-306

work page 2008
[38]

C. Wang, D. M. Blei, A split-merge MCMC algorithm for the hierarchical Dirichlet process, arXiv preprint arXiv:1201.1657, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012
[39]

W. Wang, S. Li, J. Li, W. Li, F. Wei, Exploring hypergraph-based semi-supervised ranking for query-oriented summarization,Information Sciences, 237 (2013) 271-286

work page 2013
[40]

F. Wei, W. Li, Q. Lu, Y. He, A document-sensitive graph model for multi-document summarization, Knowledge and Information Systems , 22 (2) (2010) 245-259

work page 2010
[41]

Xiong, D

S. Xiong, D. Ji, Query-focused multi-document summarization using hypergraph- based ranking, Information Processing and Management , 52 (4) (2016), 670-681

work page 2016
[42]

W. Yin, Y. Pei, Optimizing Sentence Modeling and Selection for Document Sum- marization, In: Proc. of IJCAI 2015, Buenos Aires, Argentina, 2015, pp. 1383-1389

work page 2015
[43]

Zhang, S

Z. Zhang, S. S. Ge, H. He, Mutual-reinforcement document summarization using embedded graph based sentence clustering for storytelling, Information Processing and Management, 48 (4) (2012) 767-778. 33

work page 2012

[1] [1]

Aiyar et al., The Refugee Surge in Europe: Economic Challenges, IMF Staﬀ Dis- cussion Note, International Monetary Fund, January 2016, p

S. Aiyar et al., The Refugee Surge in Europe: Economic Challenges, IMF Staﬀ Dis- cussion Note, International Monetary Fund, January 2016, p. 4, retrieved on August 31, 2017, from https://www.imf.org/external/pubs/ft/sdn/2016/sdn1602.pdf

work page 2016

[2] [2]

Arora, B

R. Arora, B. Ravindran, Latent dirichlet allocation based multi-document summa- rization, In: Proc. of AND 2008 , ACM, Singapore, Singapore, 2008, pp. 91-97

work page 2008

[3] [3]

D. S. Bershtein, A. V. Bozhenyuk, Fuzzy graphs and fuzzy hypergraphs, Encyclope- dia of Artiﬁcial Intelligence, IGI Global, Hershey, PA, 2009, pp. 704-709

work page 2009

[4] [4]

Blake, A comparison of document, sentence, and term event spaces, In: Proc

C. Blake, A comparison of document, sentence, and term event spaces, In: Proc. of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics , ACL, 2006, pp. 601-608

work page 2006

[5] [5]

D. M. Blei, A. Y. Ng, M. I. Jordan, Latent dirichlet allocation, Journal of machine Learning research, 3 (2003) 993-1022. 30

work page 2003

[6] [6]

X. Cai, W. Li, Ranking through clustering: An integrated approach to multi- document summarization, IEEE Transactions on Audio, Speech, and Language Pro- cessing, 21 (7) (2013) 1424-1433

work page 2013

[7] [7]

Cambridge online dictionary, Cambridge University Press , Cambridge, UK, 2017, retrieved at August 15, 2017

work page 2017

[8] [8]

P. Connor, Number of Refugees to Europe Surges to Record 1.3 Million in 2015, Pew Research Center , Washington, D.C., August 2, 2016, retrieved on Au- gust 31, 2015, from http://www.pewglobal.org/2016/08/02/number-of-refugees-to- europe-surges-to-record-1-3-million-in-2015/

work page 2015

[9] [9]

H. T. Dang, Overview or DUC 2005, In: Proc. of the document understanding con- ference, DUC 2005, Vancouver, Canada, 2005, pp. 1-12

work page 2005

[10] [10]

H. T. Dang, Overview of the DUC 2007 summarization task, In: Proc. of the docu- ment understanding conference, DUC 2007, Rochester, NY, 2007

work page 2007

[11] [11]

Erkan, D

G. Erkan, D. Radev, LexRank: graph-based centrality as salience in text summa- rization, Journal of Artiﬁcial Intelligence Research , 22 (2004) 457-479

work page 2004

[12] [12]

A. Fazekas, How to See the Best Total Solar Eclipse in a Century, National Geographic , June 9, 2017, retrieved from https://news.nationalgeographic.com/2017/06/total-solar-eclipse-august-how- watch-science/

work page 2017

[13] [13]

Garey, D

M. Garey, D. S. Johnson, Computers and intractability, vol. 29, W. H. Freeman & Co, New York, NY, 2002

work page 2002

[14] [14]

Y. Gong, X. Liu, Generic text summarization using relevance measure and latent semantic analysis, In: Proc. of SIGIR 2001 , ACM, New Orleans, LA, 2001, pp. 19-25

work page 2001

[15] [15]

T. Hale, Today’s Eclipse Will Actually Be Visible From The UK And Europe - Here’s How To See It, IFL Science , August 21, 2017, re- trieved from http://www.iﬂscience.com/space/dont-worry-europe-you-too-should- be-able-to-enjoy-the-eclipse/

work page 2017

[16] [16]

Hennig, D

L. Hennig, D. A. I. Labor, Topic-based Multi-Document Summarization with Prob- abilistic Latent Semantic Analysis, In: RANLP 2009, Borovets, Bulgaria, 2009, pp. 144-149

work page 2009

[17] [17]

T. D. Hoa, Overview or DUC 2006, In: Proc. of the document understanding con- ference, DUC 2006, New York, NY, 2006

work page 2006

[18] [18]

Human Rights Watch, Europe’s Migration Crisis, HRW, 2017, retrieved on August 31, 2017, from https://www.hrw.org/tag/europes-migration-crisis

work page 2017

[19] [19]

R. H. Li, J. X. Yu, Scalable diversiﬁed ranking on large graphs, IEEE Transactions on Knowledge and Data Engineering , 25(9) (2013) 2133-2146

work page 2013

[20] [20]

Lin, E.H

C.-Y. Lin, E.H. Hovy, Automatic evaluation of summaries using n-gram co- occurrence Statistics, In: Proc. of HLT-NAACL 2003 , Edmonton, Canada, 2003, pp. 71-78. 31

work page 2003

[21] [21]

H. Lin, J. Bilmes, Multi-document summarization via budgeted maximization of submodular functions, In: Proc. of HLT-NAACL 2010, Los Angeles, CA, 2010, pp. 912-920

work page 2010

[22] [22]

Lloyd, Least squares quantization in PCM, IEEE transactions on information theory, 28(2) (1982) 129-137

S. Lloyd, Least squares quantization in PCM, IEEE transactions on information theory, 28(2) (1982) 129-137

work page 1982

[23] [23]

Q. Mei, J. Guo, D. Radev, Divrank: the interplay of prestige and diversity in in- formation networks, In: Proc. of SIGKDD 2010 , ACM, Washington, DC, 2010, pp. 1009-1018

work page 2010

[24] [24]

J. N. Mordeson, P. S. Nair, Fuzzy graphs and fuzzy hypergraphs, vol. 46, Studies in Fuzziness and Soft Computing , Springer, Berlin, Germany, 2012

work page 2012

[25] [25]

National Aeronautics and Space Administration, How eclipses work, NASA, August 2017, retrieved from https://eclipse2017.nasa.gov/how-eclipses-work

work page 2017

[26] [26]

Nenkova, K

A. Nenkova, K. McKeown, Automatic summarization, Foundations and Trends in Information Retrieval, 5.2-3 (2011) 103-233

work page 2011

[27] [27]

Otterbacher, G

J. Otterbacher, G. Erkan, D. Radev, Using random walks for question-focused sen- tence retrieval, In: Proc. of HLT/EMNLP 2005 , Vancouver, Canada, 2005, pp. 915-922

work page 2005

[28] [28]

D. G. Papademetriou, M. Sumption, W. Somerville, Migration and the Economic Downturn: What to Expect in the European Union, Migra- tion Policy Institute , Washington, D.C., January 2009, Retrieved on August 31, 2017, from https://www.migrationpolicy.org/research/migration-and-economic- downturn-what-expect-european-union

work page 2009

[29] [29]

Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12 (2011) 2825-2830

F. Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12 (2011) 2825-2830

work page 2011

[30] [30]

M. F. Porter, Snowball: A language for stemming algorithms, Available at: http://www.snowball.tartarus.org/texts/introduction.html, 2001

work page 2001

[31] [31]

Portes, Immigration Is Good for Economic Growth

J. Portes, Immigration Is Good for Economic Growth. If Europe Gets It Right, Refugees Can Be Too., Huﬃngton Post , 2017, retrieved on August 31, 2017, from https://www.huﬃngtonpost.com/jonathan-portes/economic-europe- refugees b 8128288.html

work page 2017

[32] [32]

Rokach, O

L. Rokach, O. Maimon, Clustering methods, Data mining and knowledge discovery handbook, Springer US, New York, NY, 2005, pp. 321-352

work page 2005

[33] [33]

C. Shen, T. Li, Multi-document summarization via the minimum dominating set, In: Proc. of COLING 2010 , Beijing, China, 2010, pp. 984-992

work page 2010

[34] [34]

Y. W. Teh, M. I. Jordan, M. J. Beal, D. M. Blei, Sharing clusters among related groups: Hierarchical Dirichlet processes, In: Advances in neural information pro- cessing systems, NIPS 2005, Vancouver, Canada, 2005, pp. 1385-1392

work page 2005

[35] [35]

United Nations High Commissioner for Refugees, Insecurity, economic crisis, abuse and exploitation in Libya push refugees and migrants to Europe, UNHCR, July 3, 2017, retrieved on August 31, 2017, from http://www.unhcr.org/afr/news/press/2017/7/595a03bb4/insecurity-economic- crisis-abuse-exploitation-libya-push-refugees-migrants.html. 32

work page 2017

[36] [36]

Wan, Subtopic-based multimodality ranking for topic-focused multidocument summarization, Computational Intelligence, 29(4) (2013) 627-648

X. Wan, Subtopic-based multimodality ranking for topic-focused multidocument summarization, Computational Intelligence, 29(4) (2013) 627-648

work page 2013

[37] [37]

X. Wan, J. Yang, Multi-document summarization using cluster-based link analysis, In: Proc. of SIGIR 2008 , ACM, Singapore, Singapore, 2008, pp. 299-306

work page 2008

[38] [38]

C. Wang, D. M. Blei, A split-merge MCMC algorithm for the hierarchical Dirichlet process, arXiv preprint arXiv:1201.1657, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012

[39] [39]

W. Wang, S. Li, J. Li, W. Li, F. Wei, Exploring hypergraph-based semi-supervised ranking for query-oriented summarization,Information Sciences, 237 (2013) 271-286

work page 2013

[40] [40]

F. Wei, W. Li, Q. Lu, Y. He, A document-sensitive graph model for multi-document summarization, Knowledge and Information Systems , 22 (2) (2010) 245-259

work page 2010

[41] [41]

Xiong, D

S. Xiong, D. Ji, Query-focused multi-document summarization using hypergraph- based ranking, Information Processing and Management , 52 (4) (2016), 670-681

work page 2016

[42] [42]

W. Yin, Y. Pei, Optimizing Sentence Modeling and Selection for Document Sum- marization, In: Proc. of IJCAI 2015, Buenos Aires, Argentina, 2015, pp. 1383-1389

work page 2015

[43] [43]

Zhang, S

Z. Zhang, S. S. Ge, H. He, Mutual-reinforcement document summarization using embedded graph based sentence clustering for storytelling, Information Processing and Management, 48 (4) (2012) 767-778. 33

work page 2012