Asiket Singh Dhillon
Neural networks and pattern recognition
are a rising force in the technology sector. Once we can teach machines to
‘learn’, the avenues in which they can be used keep expanding. Improved
neural networks could make the automatic screening of individuals who may
have cognitive and behavioural abnormalities more efficient.
Earlier this year Tuka Al Hanai, Mohammad
Ghassemi and James Glass presented their paper, Detecting Depression with Audio/Text
Sequence Modeling of Interviews. It puts forward a model for detecting
depression in individuals which, rather than relying on traditional
question-and-answer screening, uses what they call “context-free” detection.
Traditional models analyse a set of answers to predetermined questions, for
example “How often do you feel low/depressed?” or “Does your family have a
history of any mental diagnoses?”. Such a model knows what a depressed person
would typically say and compares the individual’s answers to see where they fit.
Al Hanai’s model analyses the content of
answers as well as the manner in which they are expressed. It is therefore not
constrained to one specific conversation but can draw on patterns recognised in
other individuals with similar content and manner of expression to reach its
assessment. Depressed individuals may, for instance, speak more slowly and with
less voice modulation than healthy individuals. The model recognises such
patterns as they emerge and then presents its diagnosis. This approach is known
as sequence modelling: instead of analysing case by case, the model learns
recognisable patterns and applies them across a variety of contexts. Sequence
modelling means the analysis is not restricted to pointed questions and direct
answers but can cover entire conversations, and it can also estimate the
severity of depression.
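To make the idea of sequence modelling a little more concrete, here is a minimal sketch in Python. It is not the researchers' network: it is a single toy recurrent unit, and the weights `w_in`, `w_rec`, `w_out` and the input numbers are made up purely for illustration. It only shows how a model can read a conversation of any length, one step at a time, and fold it into a single score.

```python
import math

def score_sequence(features, w_in=0.8, w_rec=0.5, w_out=1.0):
    """Fold a variable-length sequence of numbers into one score in (0, 1).

    The weights are hypothetical; a real model would learn them from data.
    """
    state = 0.0
    for x in features:
        # The new state depends on the current input and the previous state,
        # so earlier parts of the conversation influence later ones.
        state = math.tanh(w_in * x + w_rec * state)
    # Squash the final state into a probability-like score.
    return 1.0 / (1.0 + math.exp(-w_out * state))

# Made-up per-answer features (think of a number summarising each reply).
short_conversation = [0.2, -0.1, 0.4]
long_conversation = [0.2, -0.1, 0.4, 0.3, -0.5, 0.1, 0.6]

print(score_sequence(short_conversation))
print(score_sequence(long_conversation))
```

The same function handles both conversations without knowing which questions were asked or how many there were, and the order of the inputs changes the result.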
Out of a total of 142 subjects that the
system analysed, it labelled 28 individuals as having depression. In terms
of precision, the system scored 71%. Of the 142, some were used to train
the model and others for development, which left 47 (25%) of the subjects
for the model to assess. While this is far from ideal, it is a learning
step. The implication of such a model is that it would be possible not to
rely only on self-reports and manual analysis, which may be hindered by the
assessor’s biases. The system would be able to analyse casual conversation and
provide assistance. Furthermore, individuals who cannot be physically present
at the practitioner’s office would also fall within the purview of the system.
Clinicians are then no longer restricted to asking specific questions and
receiving direct answers within a specific context.
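For readers unfamiliar with the term, precision is the share of people the system flags as depressed who really are depressed: true positives divided by all positives the system reported. A small sketch of the calculation follows. The counts are hypothetical, chosen only because they land near 71%; they are not taken from the paper.

```python
def precision(true_positives, false_positives):
    """Fraction of flagged individuals who were correctly flagged."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nobody is flagged")
    return true_positives / flagged

# Hypothetical counts: 5 correct flags and 2 incorrect flags.
example = precision(true_positives=5, false_positives=2)
print(round(example, 2))  # 0.71
```

Precision says nothing about the depressed individuals the system missed, so it is usually read alongside other measures.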
What the researchers found on poring over
the results was that the system requires over four times as many audio
interactions (30) as text interactions (7) to identify patterns. I believe this
may be because audio requires far more categorisations than text, such as
modulation, tone, pauses between words and tempo, whereas the textual analysis
is fairly constrained to the content of the text and its arrangement. While the
parameters that the system works on are understood, how the system reaches its
conclusions and which patterns it recognises are hard to discern. This is due
to the nature of neural networks: the role that individual nodes play in the
larger network is largely unknown. So we know the input, which is the audio and
transcripts, and the output, which is the model’s assessment. Figuring out how
it reached that conclusion is the next step forward.
The researchers hope that this model paves
the way for different, specialised versions which can diagnose other impairments
such as dementia. The way forward is a symbiotic relationship between the
clinician and the model, together with improving the accuracy of the model and
increasing our understanding of the inner workings of neural networks.
Matheson, R. (2018, August 29). Model can more naturally detect depression
in conversations. MIT News. Retrieved from http://news.mit.edu/2018/neural-network-model-detect-depression-conversations-0830
Al Hanai, T., Ghassemi, M., & Glass, J. (2018). Detecting Depression with
Audio/Text Sequence Modeling of Interviews. Proc. Interspeech 2018,
1716-1720. DOI: 10.21437/Interspeech.2018-2522