tracking_column is case sensitive #346

Open
romain-chanu opened this issue Aug 14, 2019 · 4 comments

@romain-chanu

Environment

  • Version: ES / Logstash version 6.7 (the same problem might be observed in higher versions).
  • Database tested:
    • MySQL version 8 + mysql-connector-java-8.0.13.jar
    • MySQL version 8 + mysql-connector-java-8.0.17.jar

Problem

The documentation is confusing. From my understanding, tracking_column refers to a column name from a database table. For example, imagine a database table Users with a column named UserID.

If I set tracking_column to UserID (keeping lowercase_column_names at its default value of true), Logstash logs the following error:

tracking_column not found in dataset. {:tracking_column=>"UserID"}

As mentioned in the documentation, each row in the resultset becomes a single event. Columns in the resultset are converted into fields in the event.

If my understanding is correct, Logstash converts the column names to lowercase and uses those lowercase values as the event field names. In that situation, the tracking_column value should also be in lowercase.

Given a database column named UserID, there are two possible configurations, depending on one's business needs:

Option 1: Event field names should be the same as the column names (i.e. case is preserved):

lowercase_column_names => "false"
tracking_column => "UserID"

Option 2: Event field names should be in lowercase:

lowercase_column_names => "true"
tracking_column => "userid"
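
For context, Option 2 placed inside a complete jdbc input might look like the sketch below. The driver path, connection string, credentials, schedule, and query are placeholders assumed for illustration and are not taken from this report; the sketch also assumes UserID is a numeric column.

```conf
input {
  jdbc {
    # Placeholder connection details (assumptions for illustration only)
    jdbc_driver_library => "/path/to/mysql-connector-java-8.0.17.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "logstash"
    jdbc_password => "changeme"

    # Run once a minute; :sql_last_value holds the last tracked value
    schedule => "* * * * *"
    statement => "SELECT * FROM Users WHERE UserID > :sql_last_value ORDER BY UserID"

    # Track the value of an event field rather than the last run time
    use_column_value => true
    tracking_column_type => "numeric"

    # Column UserID becomes event field "userid", so tracking_column must match that
    lowercase_column_names => true
    tracking_column => "userid"
  }
}
```

For Option 1, the same sketch would use lowercase_column_names => false and tracking_column => "UserID".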

Proposal/Suggestion

To avoid confusion, I think the documentation should mention that the tracking_column value is case sensitive. Its value should be the column name in lowercase if lowercase_column_names is set to true. Otherwise, its value should be the same as the column name (case sensitive).

@guyboertje
Contributor

@romain-chanu @karenzone

"From my understanding, tracking_column refers to a column name from a database table"

The setting tracking_column actually refers to a field in the event. It might as well have been called field_that_provides_a_value_to_track :-)
The setting that preserves the case of column names in field names was added long after the tracking settings and their docs. We simply did not carry it through to the docs.
We should add a note to clarify.
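
One consequence of tracking_column naming an event field: if the query aliases a column, the alias is what becomes the field name, so that is presumably what tracking_column has to match. A minimal sketch, assuming the Users table from above and a hypothetical alias uid (connection settings omitted):

```conf
input {
  jdbc {
    # Connection settings omitted; "uid" is a hypothetical alias used for illustration
    statement => "SELECT UserID AS uid FROM Users WHERE UserID > :sql_last_value ORDER BY UserID"
    use_column_value => true
    tracking_column_type => "numeric"
    # The event field is named after the alias, not the table column
    tracking_column => "uid"
  }
}
```

Because the alias is already lowercase, this form works the same whether lowercase_column_names is true or false.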

@romain-chanu
Author

romain-chanu commented Aug 15, 2019

@guyboertje: I agree with you and totally understand your point. Going by the documentation alone, tracking_column is "The column whose value is to be tracked if use_column_value is set to true", hence the confusion. I figured out that it is in fact the field and not the column. The documentation definitely needs to be updated to mention all the above points 😃

@karenzone
Contributor

@romain-chanu: What do you think about this?

The column containing the value to be tracked. Used only if use_column_value is set to true.

@guyboertje: I welcome your input as well.

@romain-chanu
Author

@karenzone: That is about the same as the current documentation and does not reflect what we discussed.

I would suggest something along these lines:

tracking_column --> The name of the column whose value is to be tracked if use_column_value is set to true. Its value is the column name in lowercase if lowercase_column_names is set to true. Otherwise, its value is the database column name (case sensitive).

@guyboertje I know tracking_column follows the event field name, but it might be confusing to mention the event field when the parameter is actually named tracking_column. What do you think?
