silvadenisaraujo/machine-learning-capstone-project

Machine Learning Engineer Nanodegree

Capstone Proposal

Dênis Araújo da Silva, March 25th, 2019

Proposal

(approx. 2-3 pages)

Domain Background

(approx. 1-2 paragraphs)

Syphilis is an infectious disease transmitted sexually or vertically during pregnancy. It is characterized by alternating periods of activity and latency, disseminated systemic involvement, and progression to acute complications in patients who remain untreated or have been inadequately treated. Syphilis has been known since the 15th century and is studied by all medical specialties, particularly dermatology. Its etiologic agent, Treponema pallidum, was described over 100 years ago and has never been cultured. The disease has been effectively treated with penicillin since 1943, but it remains an important health problem in both developed and developing countries. Given how it is transmitted, the disease has tracked the behavioral changes in society in recent years, and it has become even more important because it may increase the risk of transmission of acquired immunodeficiency syndrome (AIDS). Measures adopted by health programs to control syphilis include new laboratory tests, control methods aimed at appropriate treatment of patients and their partners, condom use, and dissemination of information to the population.

The World Health Organization (WHO) estimates that there are 340 million new cases of curable sexually transmitted diseases (STDs), namely syphilis, gonorrhea, chlamydial infection, and trichomoniasis, with 12 million cases in Brazil. Prevalence data from the tropics show that, depending on the region, syphilis is the second or third most common cause of genital ulcers (the others being chancroid and genital herpes) [2]. Syphilis has resurged in Ireland, Germany, and American cities such as San Francisco and Los Angeles among groups with risk behaviors, such as men who have sex with men (MSM) and sex workers [5-8]. In the United States, primary syphilis cases increased by 11.2%, rising from 7,177 in 2003 to 7,980 in 2004 [9].

Regarding congenital syphilis, data collected in prenatal programs and maternity hospitals show elevated seroprevalence, especially in African countries [10-12]. In Brazil, there were an estimated 843,300 cases of syphilis in 2003. Because it is not a disease of compulsory notification, epidemiological studies are carried out in facilities that treat STDs or in selected groups of patients, such as pregnant women, soldiers, and prisoners [13-15]. A total of 24,448 cases of congenital syphilis were recorded between 1998 and 2004 [13,16,17].

Problem Statement

(approx. 1 paragraph)

In this section, clearly describe the problem that is to be solved. The problem described should be well defined and should have at least one relevant potential solution. Additionally, describe the problem thoroughly such that it is clear that the problem is quantifiable (the problem can be expressed in mathematical or logical terms), measurable (the problem can be measured by some metric and clearly observed), and replicable (the problem can be reproduced and occurs more than once).

Datasets and Inputs

(approx. 2-3 paragraphs)

In this section, the dataset(s) and/or input(s) being considered for the project should be thoroughly described, such as how they relate to the problem and why they should be used. Information such as how the dataset or input is (was) obtained, and the characteristics of the dataset or input, should be included with relevant references and citations as necessary. It should be clear how the dataset(s) or input(s) will be used in the project and whether their use is appropriate given the context of the problem.

Interesting Metrics:

  • Condom use, population ages 15-24, female (% of females ages 15-24)
  • Syphilis antenatal
  • Cause of death, by communicable diseases and maternal, prenatal and nutrition conditions (% of total)
  • Pregnant women receiving prenatal care (%)
  • Low-birthweight babies (% of births)
  • Current health expenditure (% of GDP)
  • Nurses and midwives (per 1,000 people)
  • Community health workers (per 1,000 people)
  • Population
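As a rough illustration of how the indicators listed above might be turned into model inputs, the sketch below filters long-format rows (one row per country, year, and indicator) and pivots them into one feature dictionary per country and year. The field names `country_name`, `year`, `indicator_name`, and `value`, the helper `pivot_indicators`, and the sample values are all assumptions made for this example; the actual World Bank data may use different names and spellings.

```python
from collections import defaultdict

# Indicator names copied from the list above; the exact spelling used in the
# World Bank data may differ, so treat these strings as assumptions.
INDICATORS_OF_INTEREST = {
    "Pregnant women receiving prenatal care (%)",
    "Low-birthweight babies (% of births)",
    "Current health expenditure (% of GDP)",
}


def pivot_indicators(records, wanted=INDICATORS_OF_INTEREST):
    """Turn long-format rows into one feature dict per (country, year).

    Rows for indicators outside `wanted`, or with a missing value,
    are skipped.
    """
    table = defaultdict(dict)
    for row in records:
        if row["indicator_name"] in wanted and row["value"] is not None:
            key = (row["country_name"], row["year"])
            table[key][row["indicator_name"]] = row["value"]
    return dict(table)


# Made-up values, for illustration only (not real World Bank figures).
sample_records = [
    {"country_name": "Country A", "year": 2010,
     "indicator_name": "Low-birthweight babies (% of births)", "value": 8.0},
    {"country_name": "Country A", "year": 2010,
     "indicator_name": "Current health expenditure (% of GDP)", "value": 9.5},
    {"country_name": "Country A", "year": 2010,
     "indicator_name": "Population, total", "value": 1000000},
    {"country_name": "Country B", "year": 2010,
     "indicator_name": "Low-birthweight babies (% of births)", "value": None},
]

features = pivot_indicators(sample_records)
```

Keeping one row per country and year makes it straightforward to see which indicators are missing for a given observation before any modeling is attempted.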

Solution Statement

(approx. 1 paragraph)

In this section, clearly describe a solution to the problem. The solution should be applicable to the project domain and appropriate for the dataset(s) or input(s) given. Additionally, describe the solution thoroughly such that it is clear that the solution is quantifiable (the solution can be expressed in mathematical or logical terms), measurable (the solution can be measured by some metric and clearly observed), and replicable (the solution can be reproduced and occurs more than once).

Benchmark Model

(approximately 1-2 paragraphs)

In this section, provide the details for a benchmark model or result that relates to the domain, problem statement, and intended solution. Ideally, the benchmark model or result contextualizes existing methods or known information in the domain and problem given, which could then be objectively compared to the solution. Describe how the benchmark model or result is measurable (can be measured by some metric and clearly observed) with thorough detail.

Evaluation Metrics

(approx. 1-2 paragraphs)

In this section, propose at least one evaluation metric that can be used to quantify the performance of both the benchmark model and the solution model. The evaluation metric(s) you propose should be appropriate given the context of the data, the problem statement, and the intended solution. Describe how the evaluation metric(s) are derived and provide an example of their mathematical representations (if applicable). Complex evaluation metrics should be clearly defined and quantifiable (can be expressed in mathematical or logical terms).

Project Design

(approx. 1 page)

In this final section, summarize a theoretical workflow for approaching a solution given the problem. Provide thorough discussion for what strategies you may consider employing, what analysis of the data might be required before being used, or which algorithms will be considered for your implementation. The workflow and discussion that you provide should align with the qualities of the previous sections. Additionally, you are encouraged to include small visualizations, pseudocode, or diagrams to aid in describing the project design, but it is not required. The discussion should clearly outline your intended workflow of the capstone project.

References

Dataset:

  • https://www.kaggle.com/theworldbank/world-bank-health-population
  • https://www.kaggle.com/paultimothymooney/how-to-query-the-world-bank-ghnp-data
  • https://www.kaggle.com/sohier/introduction-to-the-bq-helper-package

  1. Goh BT. Syphilis in adults. Sex Transm Infect. 2005;81:448-52.
  2. Hopkins S, Lyons F, Coleman C, Courtney G, Bergin C, Mulcahy F. Resurgence in infectious syphilis in Ireland: an epidemiological study. Sex Transm Dis. 2004;31:317-21.
  3. Marcus U, Kollan C, Bremer V, Hamouda O. Relation between the HIV and the re-emerging syphilis epidemic among MSM in Germany: an analysis based on anonymous surveillance data. Sex Transm Dis. 2005;81:456-7.
  4. Brasil. Ministério da Saúde. Diretrizes de Controle da Sífilis Congênita. Brasília (DF): Ministério da Saúde; p. 7-53.
  5. Codes JS, Cohen DA, Melo NA, Teixeira GG, Leal Ados S, Silva Tde J, et al. Screening of sexually transmitted diseases in clinical and non-clinical settings in Salvador, Bahia, Brazil. Cad Saude Publica. 2006;22:325-34.
  6. Szwarcwald CL, de Carvalho MF, Barbosa Junior A, Barreira D, Speranza FA, de Castilho EA. Temporal trends of HIV-related risk behavior among Brazilian military conscripts. Clinics. 2005;60:367-74.
  7. Brasil. Ministério da Saúde. Manual de Controle das Doenças Sexualmente Transmissíveis. 3. ed. Brasília (DF): Ministério da Saúde; 1999. p. 44-54.
  8. Miranda AE, Alves MC, Neto RL, Areal KR, Gerbase AC. Seroprevalence of HIV, hepatitis B virus, and syphilis in women at their first visit to public antenatal clinics in Vitoria, Brazil. Sex Transm Dis. 2001;28:710-3.

Before submitting your proposal, ask yourself. . .

  • Does the proposal you have written follow a well-organized structure similar to that of the project template?
  • Is each section (particularly Solution Statement and Project Design) written in a clear, concise and specific fashion? Are there any ambiguous terms or phrases that need clarification?
  • Would the intended audience of your project be able to understand your proposal?
  • Have you properly proofread your proposal to assure there are minimal grammatical and spelling mistakes?
  • Are all the resources used for this project correctly cited and referenced?
