From: acsinger@uiuc.edu
Sent: Sunday, October 12, 2003 9:52 PM
To: Joseph Picone
Cc: k.uherek@ieee.org; acsinger@uiuc.edu
Subject: ((RQ)) IEEE Transactions on Signal Processing - T-SP-01444-2003

Oct 12, 2003

Dr. Joseph Picone
MS State Box 9571
MS State, MS 39762

Paper: T-SP-01444-2003
Applications of Risk Minimization to Speech Recognition

Dear Dr. Picone,

I am writing to you concerning the above referenced manuscript, which you submitted to the IEEE Transactions on Signal Processing. Based on the enclosed set of reviews, it was recommended that the manuscript be REVISED AND RESUBMITTED (RQ). I would like to call your attention to the specific concerns of the reviewers as to the novelty of the results and the specific contributions vis-a-vis the literature on SVMs. We hope you will be able to implement the comments of the reviewers (**See note below about attachments).

Please note that the reviewers will review your revised article only one more time. The only decisions available to an AE after a second round of reviews are A (Accepted), AQ (Accepted with Mandatory Revisions), and R (Reject).

Your revised manuscript must be submitted back to Manuscript Central (http://sps-ieee.manuscriptcentral.com) no later than 60 days from the date of this letter to be further considered for publication in the IEEE Transactions on Signal Processing. If we do not receive your revised manuscript within this specified time, your manuscript will be considered withdrawn.

Please be sure to upload the revised manuscript in Dr. Picone's account. That is the account that will have the number T-SP-01444-2003R1 listed. Once you see this number, simply click on the underlined title and follow the submission guidelines. If you do not see this number, please send an email to Mr. Kevin Uherek at k.uherek@ieee.org and he will assist you in finding this number.

If you have any questions regarding the reviews, please contact me. Any other inquiries should be directed to Mr. Kevin Uherek.

Regards,

Dr. Andrew Singer
acsinger@uiuc.edu

Mr. Kevin Uherek
k.uherek@ieee.org
SPS Publications Office

**Any ATTACHMENTS FROM THE REVIEWERS can be found by you (Dr. Picone) going to the paper in your account. On the right side of the screen you will see a button "View Comments/Respond". Click on it. You will then see the same email sent to you from Dr. Andrew Singer, but the attachments from reviewers are listed at the very bottom. http://sps-ieee.manuscriptcentral.com

Reviewer 1 Comments:

I. REVIEW

Please expand and give details in Section III.

A. Suitability of topic

1. Is the topic appropriate for publication in these transactions?
(X) Yes ( ) Perhaps ( ) No

2. Is the topic important to colleagues working in the field?
(X) Yes ( ) Moderately So ( ) No (explain: )

B. Content

1. Is the paper technically sound?
(X) Yes ( ) No (why not? )

2. Is the coverage of the topic sufficiently comprehensive and balanced?
( ) Yes (X) Important information is missing or superficially treated. ( ) Treatment somewhat unbalanced, but not seriously so. ( ) Certain parts significantly overstressed.

3. How would you describe the technical depth of the paper?
( ) Superficial ( ) Suitable for the non-specialist ( ) Appropriate for the Generally Knowledgeable Individual Working in the Field or a Related Field (X) Suitable only for an Expert

4. How would you rate the technical novelty of the paper?
( ) Novel (X) Somewhat Novel ( ) Not Novel

C. Presentation

1. How would you rate the overall organization of the paper?
(X) Satisfactory ( ) Could be improved ( ) Poor

2. Are the title and abstract satisfactory?
(X) Yes ( ) No (explain: )

3. Is the length of the paper appropriate? If not, recommend how the length of the paper should be amended, including a possible target length for the final manuscript.
(X) Yes ( ) No

4. Are symbols, terms, and concepts adequately defined?
(X) Yes ( ) Not always ( ) No

5. How do you rate the English usage?
(X) Satisfactory ( ) Needs improvement ( ) Poor

6. Rate the Bibliography.
(X) Satisfactory ( ) Unsatisfactory (explain: )

D. Overall rating

1. How would you rate the technical contents of the paper?
( ) Excellent (X) Good ( ) Fair ( ) Poor

2. How would you rate the novelty of the paper?
( ) Highly Novel (X) Sufficiently Novel ( ) Slightly Novel ( ) Not Novel

3. How would you rate the literary presentation of the paper?
(X) Totally Accessible ( ) Mostly Accessible ( ) Partially Accessible ( ) Inaccessible

4. How would you rate the appropriateness of this paper for publication in this IEEE Transactions?
(X) Excellent Match ( ) Good Match ( ) Weak Match ( ) Poor Match

II. RECOMMENDATION

( ) A Publish Unaltered
( ) AQ Publish in Minor, Required Changes (as noted in Section III)
(X) RQ Review Again After Major Changes (as noted in Section III)
( ) R Reject (Paper is not of sufficient quality or novelty to be published in this Transactions)
( ) R Reject (A major rewrite is required. Author should be encouraged to resubmit rewritten paper at some later time.)
( ) R Reject (Paper is seriously flawed; do not encourage resubmission.)

III. DETAILED COMMENTS

Please state why you rated the paper as you did in Sections I and II. If you have indicated that revisions are required, please give the author specific guidance regarding those revisions, differentiating between optional and mandatory changes.

1) Good tutorial on the evolution of pattern recognition techniques as applied to speech recognition. Good description of the SVM and reasons for its use. The latter may not be well known in the signal processing community.

2) Inadequate explanation and justification for eq. (13). This only gives local information about the observable pdfs near the decision boundaries.

3) Inadequate explanation of the overall recognition system. I assume that it is a standard IBM-style model, but that must be made clear and explicit.

4) Inadequate evaluation of the results. How can we conclude that there is a performance improvement? Table 1 is not enough.

5) Inadequate discussion of the pitfalls of the SVM when projecting into very high-dimensional feature spaces.

Reviewer 2 Comments:

I. REVIEW

Please expand and give details in Section III.

A. Suitability of topic

1. Is the topic appropriate for publication in these transactions?
(X) Yes ( ) Perhaps ( ) No

2. Is the topic important to colleagues working in the field?
(X) Yes ( ) Moderately So ( ) No (explain: )

B. Content

1. Is the paper technically sound?
(X) Yes ( ) No (why not? )

2. Is the coverage of the topic sufficiently comprehensive and balanced?
( ) Yes (X) Important information is missing or superficially treated. ( ) Treatment somewhat unbalanced, but not seriously so. ( ) Certain parts significantly overstressed.

3. How would you describe the technical depth of the paper?
( ) Superficial ( ) Suitable for the non-specialist ( ) Appropriate for the Generally Knowledgeable Individual Working in the Field or a Related Field (X) Suitable only for an Expert

4. How would you rate the technical novelty of the paper?
( ) Novel (X) Somewhat Novel ( ) Not Novel

C. Presentation

1. How would you rate the overall organization of the paper?
( ) Satisfactory (X) Could be improved ( ) Poor

2. Are the title and abstract satisfactory?
(X) Yes ( ) No (explain: )

3. Is the length of the paper appropriate? If not, recommend how the length of the paper should be amended, including a possible target length for the final manuscript.
(X) Yes ( ) No

4. Are symbols, terms, and concepts adequately defined?
( ) Yes (X) Not always ( ) No

5. How do you rate the English usage?
( ) Satisfactory (X) Needs improvement ( ) Poor

6. Rate the Bibliography.
(X) Satisfactory ( ) Unsatisfactory (explain: )

D. Overall rating

1. How would you rate the technical contents of the paper?
( ) Excellent ( ) Good (X) Fair ( ) Poor

2. How would you rate the novelty of the paper?
( ) Highly Novel (X) Sufficiently Novel ( ) Slightly Novel ( ) Not Novel

3. How would you rate the literary presentation of the paper?
( ) Totally Accessible (X) Mostly Accessible ( ) Partially Accessible ( ) Inaccessible

4. How would you rate the appropriateness of this paper for publication in this IEEE Transactions?
( ) Excellent Match (X) Good Match ( ) Weak Match ( ) Poor Match

II. RECOMMENDATION

( ) A Publish Unaltered
( ) AQ Publish in Minor, Required Changes (as noted in Section III)
(X) RQ Review Again After Major Changes (as noted in Section III)
( ) R Reject (Paper is not of sufficient quality or novelty to be published in this Transactions)
( ) R Reject (A major rewrite is required. Author should be encouraged to resubmit rewritten paper at some later time.)
( ) R Reject (Paper is seriously flawed; do not encourage resubmission.)

III. DETAILED COMMENTS

Please state why you rated the paper as you did in Sections I and II. If you have indicated that revisions are required, please give the author specific guidance regarding those revisions, differentiating between optional and mandatory changes.

Overall comments: The paper presents some new ideas for SVM application to speech recognition but requires substantial improvement in overall presentation. Too much space is devoted to introductory or review material; not enough is devoted to the specific proposals made by the authors. The writing style is a bit too cavalier and needs to be improved. However, the ideas described and tested appear promising. With substantial rewriting, I think this paper would be suitable for publication in the IEEE Transactions.

Specific comments:

p. 1. "Simplistic techniques...": this derogates early work in recognition and is an example of what I call a cavalier or immodest writing style. I suggest "simple techniques" instead.

"relatively fragile" --> relative to what? Drop "relative", and be more specific in describing system fragility.

"we demonstrate this" --> insert "that" between the last two words.

p. 2-4. A bit too much review of HMM/EM fundamentals. Condense.

p. 3. "based as an optimization" --> "based ON an optimization"?

"in this example any amount of effort ... will not" --> "in this example NO amount of effort".

p. 4. "The primary different between ML-based HMM... the wrong model is used": This is a good point, but the phrasing is confusing. Perhaps something like "the objective criterion for the latter reflects classification performance even if the wrong probabilistic model is used"?

p. 5-8. This overview section seems slightly long.

p. 5. "Closed-loop optimality": I assume this refers to performance on the training set. I do not believe that "closed-loop" is a common term in machine learning or speech recognition.

"the design of a classifier is essentially a process" --> "the design of a classifier is essentially THE process"

p. 6. The first paragraph introduces the term "loss function", but then this is re-introduced and italicized further down the page. Italicize/emphasize the term at its first use.

p. 8-10. This section summarizes SVMs, but this is all review material. Suggest condensing.

p. 11. On page 11 of a paper whose conclusion starts on page 16, we are finally starting to hear details of what the authors did in this study. I suggest adding one or two lines describing in more detail the ML approach to estimating A and B of the sigmoid.

"expense of increased computations" --> "expense of increased computation time"

p. 12. "the phone 's'" --> "the phone /s/". However, the point that the authors are trying to make is independent of the phone identity, so I suggest not introducing the phone identity.

p. 13. What do the authors mean exactly by "we use segments composed of the three sections"? I assume three equal-dimension vectors are concatenated. What is the dimensionality of these vectors? How are they themselves derived? The following text mentions LPC (log-area parameters), but it is not clear if this is how the segments were calculated.

p. 14. "This was a good database" --> "This is a good database"

"We have observed similar trends on a number of static classification tasks": details (or references), please.

p. 16. "When we allow the SVM to decide the best segmentation and hypothesis combination by using the N-best segmentations...": This point regarding the segmentations, either decided by the baseline HMM system or by the SVMs, is very unclear. A lot more detail is needed on the authors' proposed technique. The authors have provided several figures describing the fundamentals of SVMs, but no figures describing their own recognition system architecture. Is their method essentially an N-best rescoring method? Will they always require a baseline HMM to provide N-best lists, or could the SVMs be used to supplant the HMM?

"Table 1" -- this should be Table 2, I think.

I strongly urge the authors to give slightly less detail about review material that is already published, and many more details on what they themselves did in this study.

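On Reviewer 2's p. 11 request for more detail on the ML estimation of the sigmoid parameters A and B: a minimal sketch of a Platt-style maximum-likelihood fit is given below. The function and variable names, and the use of scipy's Nelder-Mead optimizer, are illustrative assumptions rather than details taken from the manuscript.

import numpy as np
from scipy.optimize import minimize

def fit_sigmoid(scores, labels):
    """Maximum-likelihood fit of p(in-class | f) = 1 / (1 + exp(A*f + B)).

    scores -- numpy array of raw SVM distances f_i on a held-out set
    labels -- numpy array with 1 for in-class examples, 0 otherwise
    """
    # Platt's regularized targets guard against overconfidence in the tails.
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    t = np.where(labels == 1, (n_pos + 1.0) / (n_pos + 2.0), 1.0 / (n_neg + 2.0))

    def neg_log_likelihood(params):
        A, B = params
        z = A * scores + B
        log_p = -np.logaddexp(0.0, z)               # log p       = -log(1 + exp(z))
        log_one_minus_p = z - np.logaddexp(0.0, z)  # log (1 - p) =  z - log(1 + exp(z))
        return -np.sum(t * log_p + (1.0 - t) * log_one_minus_p)

    result = minimize(neg_log_likelihood, x0=np.zeros(2), method="Nelder-Mead")
    return result.x  # fitted (A, B)

The fitted (A, B) map a raw SVM distance f to a posterior estimate p = 1 / (1 + exp(A*f + B)), whose logarithm can then be combined with other log-scores.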
Reviewer 3 Comments:

I. REVIEW

Please expand and give details in Section III.

A. Suitability of topic

1. Is the topic appropriate for publication in these transactions?
(X) Yes ( ) Perhaps ( ) No

2. Is the topic important to colleagues working in the field?
(X) Yes ( ) Moderately So ( ) No (explain: )

B. Content

1. Is the paper technically sound?
(X) Yes ( ) No (why not? )

2. Is the coverage of the topic sufficiently comprehensive and balanced?
( ) Yes (X) Important information is missing or superficially treated. ( ) Treatment somewhat unbalanced, but not seriously so. ( ) Certain parts significantly overstressed.

3. How would you describe the technical depth of the paper?
(X) Superficial ( ) Suitable for the non-specialist ( ) Appropriate for the Generally Knowledgeable Individual Working in the Field or a Related Field ( ) Suitable only for an Expert

4. How would you rate the technical novelty of the paper?
( ) Novel ( ) Somewhat Novel (X) Not Novel

C. Presentation

1. How would you rate the overall organization of the paper?
( ) Satisfactory (X) Could be improved ( ) Poor

2. Are the title and abstract satisfactory?
( ) Yes (X) No (explain: see review)

3. Is the length of the paper appropriate? If not, recommend how the length of the paper should be amended, including a possible target length for the final manuscript.
(X) Yes ( ) No

4. Are symbols, terms, and concepts adequately defined?
(X) Yes ( ) Not always ( ) No

5. How do you rate the English usage?
(X) Satisfactory ( ) Needs improvement ( ) Poor

6. Rate the Bibliography.
(X) Satisfactory ( ) Unsatisfactory (explain: )

D. Overall rating

1. How would you rate the technical contents of the paper?
( ) Excellent ( ) Good (X) Fair ( ) Poor

2. How would you rate the novelty of the paper?
( ) Highly Novel ( ) Sufficiently Novel (X) Slightly Novel ( ) Not Novel

3. How would you rate the literary presentation of the paper?
(X) Totally Accessible ( ) Mostly Accessible ( ) Partially Accessible ( ) Inaccessible

4. How would you rate the appropriateness of this paper for publication in this IEEE Transactions?
( ) Excellent Match ( ) Good Match (X) Weak Match ( ) Poor Match

II. RECOMMENDATION

( ) A Publish Unaltered
( ) AQ Publish in Minor, Required Changes (as noted in Section III)
( ) RQ Review Again After Major Changes (as noted in Section III)
(X) R Reject (Paper is not of sufficient quality or novelty to be published in this Transactions)
( ) R Reject (A major rewrite is required. Author should be encouraged to resubmit rewritten paper at some later time.)
( ) R Reject (Paper is seriously flawed; do not encourage resubmission.)

III. DETAILED COMMENTS

Please state why you rated the paper as you did in Sections I and II. If you have indicated that revisions are required, please give the author specific guidance regarding those revisions, differentiating between optional and mandatory changes.

This paper proposes two deep, theoretically interesting research topics, then drops both of them without warning. In the end, the experiments performed by the authors seem to be nothing more than standard SVM classification experiments, performed almost step by step according to the instructions in Vapnik's textbook.

Section 2 discusses the well-known problems with ML and suggests MCE as a solution. Section 3 discusses the principle of structural risk minimization. A combination of these concepts would be very interesting, as MCE is itself a type of ERM; unfortunately, both of these concepts are dropped without warning, and the paper never returns to them.

Section 4 presents the support vector machine, and Section 5 proposes an SVM/HMM hybrid. An SVM/HMM hybrid would be very interesting. ANN/HMM hybrids use the classification power of an ANN in order to improve the observation PDF computation of an HMM.
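For reference, the construction typically used in such ANN/HMM hybrids turns the network's frame-level state posterior into a scaled likelihood by dividing out the state prior; this standard relation is stated here for context and is not taken from the manuscript:

    p(x_t \mid q) \;=\; \frac{P(q \mid x_t)\, p(x_t)}{P(q)} \;\propto\; \frac{P(q \mid x_t)}{P(q)}

since p(x_t) is shared by all states at frame t and can be dropped during decoding.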
As a result, such hybrid systems often outperform mixture-Gaussian HMMs, at the cost of substantially increased computational complexity during training. An SVM/HMM hybrid, using the SVM to compute the observation PDFs of an HMM in some way, would presumably do even better than the ANN/HMM hybrid, presumably at even greater computational cost. Unfortunately, this idea is dropped immediately after being suggested, because it is too computationally complex.

In the end, the authors perform n-ary classification of fixed-length segmental feature vectors. As far as I can tell, the only interaction between the HMM system and the SVM system is that the HMM is used to find segment boundaries for the SVM to classify; the final log-probabilities of the HMM and the SVM are then apparently added together in order to create the final recognition score. In my view, adding together the scores of two classifiers, both of which are trained and tested according to standard published references, does not constitute a "hybrid" system of sufficient novelty to merit publication.
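On Reviewer 2's p. 13 and p. 16 questions and Reviewer 3's reading of the system: a minimal sketch of that reading is given below, i.e. an HMM pass supplies N-best hypotheses with segment boundaries, the SVM scores a fixed-length vector per segment, and the per-hypothesis log-scores are summed. The three-region averaging, the interpolation weight, and all function and variable names are illustrative assumptions, not details confirmed by the manuscript.

import numpy as np

def segment_vector(frames):
    """Map a variable-length segment (n_frames x n_dims) to a fixed-length
    vector by averaging three equal regions and concatenating the means."""
    regions = np.array_split(frames, 3, axis=0)
    return np.concatenate([region.mean(axis=0) for region in regions])

def rescore_nbest(nbest, svm_log_posterior, weight=1.0):
    """Choose the hypothesis with the best combined HMM + SVM log-score.

    nbest             -- list of (hmm_log_score, segment_frames, labels), where
                         segment_frames is a list of per-segment frame matrices
                         and labels the matching phone labels
    svm_log_posterior -- callable(vector, label) -> log P(label | vector),
                         e.g. the log of a Platt-style sigmoid posterior
    weight            -- interpolation weight on the SVM score (assumed)
    """
    best_hyp, best_score = None, -np.inf
    for hmm_score, segment_frames, labels in nbest:
        svm_score = sum(svm_log_posterior(segment_vector(seg), lab)
                        for seg, lab in zip(segment_frames, labels))
        total = hmm_score + weight * svm_score
        if total > best_score:
            best_hyp, best_score = labels, total
    return best_hyp, best_score

A figure or short pseudo-code of this form in the revision would directly address the reviewers' requests for an explicit description of the recognition architecture.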