| Markos Mylonakis and Khalil Sima'an. Learning Hierarchical
Translation Structure with Linguistic Annotations. In Proceedings of
the 49th Annual Meeting of the Association for Computational
Linguistics: Human Language Technologies (ACL:HLT 2011). Feb 2011. |
Syntactic statistical
Machine Translation; Estimation; Learning
reordering
|
|
| Hany Hassan,
Khalil
Sima'an and Andy Way. A Morpho-Syntactically
Enriched Direct
Translation Model with Efficient Decoding. Machine
Translation Journal, Springer, 2011. |
Syntactic statistical
Machine Translation; Incremental decoding and
parsing
|
|
Markos Mylonakis and Khalil Sima'an. Learning Probabilistic
Synchronous CFGs for Phrase Translation Models. In Proceedings of the
Fourteenth Conference on Computational Natural Language Learning
(CoNLL 2010), Uppsala, Sweden, July 2010.
|
Statistical Machine
Translation; Estimation; Learning reordering |
pdf |
| Reut Tsarfaty, Khalil Sima'an and Remko Scha. Evaluating an
Alternative to Head-Driven Approaches to Parsing a (Relatively) Free
Word-Order Language. In Proceedings of the Conference on Empirical
Methods in NLP (EMNLP'09), Singapore. |
Statistical Parsing;
Morphology-syntax interface
|
pdf
|
| Hany Hassan,
Khalil Sima'an
and Andy Way. A Syntactified Direct Translation
Model with Linear-Time
Decoding. In Proceedings of the Conference
on Empirical Methods in NLP (EMNLP'09),
Singapore. |
Syntactic Machine
Translation; Incremental decoding and parsing |
pdf |
Hany Hassan, Khalil Sima'an and Andy Way. Syntactically Lexicalized
Phrase-Based Statistical Translation. In IEEE Transactions on Audio,
Speech and Language Processing, Volume 16, Number 7, September 2008.
|
Syntactic Machine
Translation; Incremental decoding and parsing |
pdf |
| Barbara Plank and Khalil Sima'an. Parsing with Subdomain Instance
Weighting from Raw Corpora. In Proceedings of Interspeech 2008,
Australia, Sep. 2008. |
Subdomain; Domain
Adaptation; Inference from unannotated data.
|
pdf |
Markos
Mylonakis and Khalil Sima'an. Phrase Translation
Probabilities with ITG Priors and Smoothing as
Learning Objective. In
Proceedings of the Conference
on Empirical Methods in NLP (EMNLP'08),
2008.
|
Statistical Machine
Translation; Estimation; Learning |
pdf |
Reut Tsarfaty and Khalil Sima'an. Relational Realizational Parsing.
In Proceedings of COLING 2008, Manchester, UK, August 2008.
|
Statistical Parsing;
Morphology-syntax interface |
pdf |
| Roy Bar-Haim, Khalil Sima'an and Yoad Winter. Part-of-Speech
Tagging of Modern Hebrew Text. In Journal of Natural Language
Engineering (J-NLE), 14(2):223-251, 2008.
|
HMM-based Morphological
disambiguation for Hebrew and Arabic
|
pdf |
Markos Mylonakis, Khalil Sima'an and Rebecca Hwa. Unsupervised
Estimation for Noisy-Channel Models. In Proceedings of the 24th Annual
International Conference on Machine Learning (ICML 2007).
|
Learning lexicon
probabilities from non-parallel data
|
pdf
|
Hany Hassan, Khalil Sima'an and Andy Way. Supertagged Phrase-Based
Statistical Machine Translation. In Proceedings of the 45th Annual
Meeting of the Association for Computational Linguistics (ACL 2007),
Prague, 2007.
|
Syntactic Machine
Translation; Incremental decoding and parsing |
pdf
|
Andreas Zollmann and Khalil Sima'an. A Consistent and Efficient
Estimator for Data-Oriented Parsing. Journal of Automata, Languages
and Combinatorics (JALC), Vol. 10 (2005), Number 2/3, pages 367-388.
Presents a consistent estimator for DOP with a proof of consistency.
|
Consistent estimators for
DOP
|
pdf |
| Khalil Sima'an. Robust Data-Oriented Understanding of Spoken
Utterances. In H. Bunt, J. Carroll and G. Satta (eds.), New
Developments in Parsing Technologies, pages 323-338, Kluwer (2004). |
Speech understanding;
Update Semantics; Statistical Parsing
|
pdf |
Khalil Sima'an and Luciano Buratto. Backoff Parameter Estimation for
the DOP Model. In Proceedings of the European Conference on Machine
Learning (ECML'03), N. Lavrac, D. Gamberger, H. Blockeel and
L. Todorovski (eds.). Lecture Notes in Artificial Intelligence
(LNAI 2837), pages 373-384, Springer, 2003.
|
Consistent estimators for
DOP |
pdf |
Khalil Sima'an. On Maximizing Metrics for Syntactic Disambiguation.
In Proceedings of the International Workshop on Parsing Technologies
(IWPT'03). Nancy, France, April 2003.
Presents, among other things, a Minimum-Bayes Risk decoding algorithm
that others refer to as MAX-RULE-SUM (see, e.g., Bansal and Klein
2010, ..., Petrov et al. 2007, Matsuzaki et al. 2005) or as the
Maximum Expected CFG Rule Count algorithm (see, e.g., Cohn et al.
2008).
|
Minimum-Bayes Risk
decoding for statistical parsing models
|
pdf |
| Rens Bod, Remko Scha and Khalil Sima'an (editors). Data-Oriented
Parsing. Studies in Computational Linguistics, CSLI Publications,
University of Chicago Press, 2003. |
Data-Oriented Parsing
|
|
| Khalil
Sima'an. Computational
Complexity of Probabilistic
Disambiguation: NP-Completeness
Results for Language and Speech
Processing. In Grammars,
Volume
5(2),
Kluwer Publishers, 2002. |
Computational Complexity
of statistical disambiguation
|
pdf |
| Khalil Sima'an, A. Itai, Y. Winter, A. Altman and N. Nativ.
Building a Tree-Bank of Modern Hebrew Text. In Beatrice Daille and
Laurent Romary (eds.), Journal Traitement Automatique des Langues
(T.A.L), 2001. Special Issue on Natural Language Processing and
Corpus Linguistics. |
Hebrew treebanking;
Statistical Parsing
|
pdf |
Khalil Sima'an. Tree-gram Parsing: Lexical Dependencies and
Structural Relations. In Proceedings of the 38th Annual Meeting of the
Association for Computational Linguistics (ACL'00), Hong Kong, China,
2000.
Content: Presents a novel model for parsing that combines the
strengths of DOP with those of bilexical-dependency models (Charniak
1999; Collins 1997), including head-binarized "subtrees" (Tree-grams)
with label splitting by parent-encoding (Johnson 1999) and head
pre-terminals. The implementation employs a coarse-to-fine parser
(PCFG then Tree-gram). See Mohit Bansal and Dan Klein 2010 for an
exploration extending this model with compact Goodman representations,
optimized parameter settings and state-of-the-art parsing results for
different languages.
|
Treegrams; Statistical
Parsing; Horizontal-Markov DOP; Lexicalized DOP
|
pdf |
| Khalil Sima'an. Efficient Disambiguation by means of Stochastic
Tree Substitution Grammars. In New Methods in Language Processing.
D. Jones and H. Somers (editors), UCL Press, UK, 1997. |
Statistical Parsing
|
pdf |
| Remko Scha, Rens Bod and Khalil Sima'an. A Memory-Based Model of
Syntactic Analysis: Data-Oriented Parsing. In Special Issue on
Memory-Based Processing, W. Daelemans (ed.), Journal of Empirical and
Theoretical Artificial Intelligence (JETAI), 11(3), 1999.
|
Data-Oriented Parsing;
Memory-based models
|
pdf |
| Khalil Sima'an. Explanation-Based Learning of Data-Oriented
Parsing. In Proceedings of the Conference on Computational Natural
Language Learning (CoNLL), jointly with ACL/EACL-97, Madrid, Spain,
July 1997. The first publication aiming at learning the set of
fragments for a DOP model from a treebank.
|
Statistical learning of
compact DOP models
|
pdf |
Khalil Sima'an. Computational Complexity of Probabilistic
Disambiguation by means of Tree Grammars. In Proceedings of the
International Conference on Computational Linguistics (COLING '96),
pp. 1175-1180 (vol. 2), Copenhagen, Denmark, August 1996.
Content: Presents a proof of NP-Completeness for a set of related
problems, including computing the highest probability parse for an
input sentence under probabilistic tree-substitution grammars (PTSGs),
and computing the highest probability string from an input lattice
under a probabilistic context-free grammar (PCFG) and under PTSG. The
latter two problems are the problems of "decoding" in speech
recognition and machine translation when the (target) language model
is a PCFG or PTSG.
|
Computational complexity
of statistical disambiguation
|
pdf |