Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Customizable transfer files #9

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

ethack
Copy link
Contributor

@ethack ethack commented Aug 16, 2021

General approach:

  • Config file that determines the files that are sent. Located on AC-Hunter system.
  • Script pulls config file from AC-Hunter system first.
  • Script pushes logs according to config file to AC-Hunter system.

Closes: #3

Samuel Carroll added 4 commits August 11, 2021 11:21
…e sure we create the yaml file on destination during the connect sensor install
…ing WARNING: the changes to connect_sensor.sh have not been tested and is likely unstable.
…he send logs script to send 3 day default if it's not on the remote system
Copy link
Contributor Author

@ethack ethack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SamuelCarroll
Took a first pass at a review.
There looks to be some intend difference, maybe tabs vs. spaces? Can you fix those up as well please?

dest_log_trans=$2
local_dest_dir=$3

`rsync $destination:$dest_log_trans $local_dest_dir`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the backticks are necessary here.

Please quote the variables.

request_logs=`cat $local_dest_dir | grep -v : | grep -v #`

##TODO check if the request logs is set to all if so we should set the send_logs to the zeek_logs variable
if [[ " ${zeek_logs[*]} " == *"all"* ]]; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the spaces around ${zeek_logs[*]} are intended. I don't think the *"all"* needs to be quoted. In general, things in [[ ]] don't need to be quoted. But strings with spaces or variables in [ ] or elsewhere do need to be quoted.

@@ -17,6 +17,48 @@ export PATH="/sbin:/usr/sbin:$PATH" #Note that cron does _NOT_ include /sbin in

default_user_on_aihunter='dataimport'

get_send_logs() {
##These names were found at https://docs.zeek.org/en/master/script-reference/log-files.html
declare -a zeek_logs=("conn" "dce_rpc" "dhcp" "dnp3" "dns" "ftp" "http" "irc" "kerberos" "modbus" "modbus_register_change" "mysql" "ntlm" "ntp" "radius" "rdp" "rfb" "sip" "smb_cmd" "smb_files" "smb_mapping" "smtp" "snmp" "socks" "ssh" "ssl" "syslog" "tunnel" "files" "ocsp" "pe" "x509" "netcontrol" "netcontrol_drop" "netcontrol_shunt" "netcontrol_catch_release" "openflow" "intel" "notice" "notice_alarm" "signatures" "traceroute" "known_certs" "known_hosts" "known_modbus" "known_services" "software" "barnyard2" "dpd" "unified2" "unknown_protocols" "weird" "weird_stats" "broker" "capture_loss" "cluster" "config" "loaded_scripts" "packet_filter" "print" "prof" "reporter" "stats" "stderr" "stdout" "conn-summary")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reformat these to be one per line and sorted alphabetically? That'll be easier to tell at a glance if we're missing a file or not.

##Ensure we only send the zeek logs
for log in $request_logs
do
if [[ " ${zeek_logs[*]} " == *"${log}"* ]]; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments here. I'm not sure if the spaces around ${zeek_logs[*]} are intended. I don't think the *"${log}"* needs to be quoted.

send_candidates=`find . -type f -mtime -3 -iname '*.gz' | egrep '(conn|dns|http|ssl|x509|known_certs)' | sort -u`

query="find . -type f -mtime -$request_days -iname '*.gz' | egrep '($logs_str)' | sort -u"
send_candidates=$(eval "$query")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will evaluate to a string, and I think it should be quoted: "$(eval "$query")"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing this works with quotes around the whole command "$(eval $query)" or around query like so $(eval "$query") but I can't have quotes around both like "$(eval $"query")"
I will push up a change for the quotes around the whole command

Copy link
Contributor Author

@ethack ethack Aug 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have quotes like "$(eval "$query")" (but not like in your comment with $"query"). From other languages, you'd think that the nested quotes would conflict with each other but $() is a subshell and my understanding is any quotes inside are not matched with anything outside it.

Here's an example:

$ cat test.sh
#!/bin/bash

print_num_args() {
    echo $#
}

args="one two"

# this treats $args as two separate arguments
how_many="$(print_num_args $args)"
echo "without inner quotes: $how_many"

# this treats $args as a single argument
how_many="$(print_num_args "$args")"
echo "with inner quotes: $how_many"
$ bash test.sh
without inner quotes: 2
with inner quotes: 1

So the difference is how bash treats the argument when it has whitespace in it. Bash will split unquoted arguments on whitespace and quoted arguments will not be split.

That being said, I think you're ok in this case. I read the help page and eval takes multiple arguments and just concatenates them together with a space before running.

And I also just learned something. The quotes outside the subshell appear to be unnecessary even with whitespace. Take a look at this example:

$ cat test.sh       
#!/bin/bash

test1=one two three
echo "$test1"

test2=$(echo one two three)
echo "$test2"

The only difference in setting these two variables is that one is in a subshell and one is not. They are both unquoted. But the one in the subshell works and the other doesn't.

$ bash test.sh      
test.sh: line 3: two: command not found

one two three

So with the behavior of eval and subshells combined, I think quotes are completely unnecessary for this particular case. All of these should work and have slightly different nuances. And now I prefer your original line of code lol.

  • $(eval $query)
  • $(eval "$query") - This will likely be my pick in my code going forward since it reduces the quote soup. Until I learn about some new nuance of bash that bites me in the future.
  • "$(eval $query)"
  • "$(eval "$query")"

Bash is weird 🤦‍♂️

Copy link
Contributor Author

@ethack ethack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SamuelCarroll Round two :)

@@ -237,14 +347,25 @@ status "Sending logs to rita/aihunter server $aih_location , My name: $my_id , l
status "Preparing remote directories"
ssh $extra_ssh_params "$aih_location" "mkdir -p ${remote_top_dir}/$today/ ${remote_top_dir}/$yesterday/ ${remote_top_dir}/$twoda/ ${remote_top_dir}/$threeda/ ${remote_top_dir}/current/"

recv_loc="`pwd`/zeek-log-transport.yaml"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of pwd we should copy the file to a place we know is writable by the current user. Here's some sample code that might be useful.

recv_dir="$(mktemp)"
recv_loc="$recv_dir/zeek-log-transport.yaml"
# clean up the temporary files on exit
trap "rm -rf '$recv_dir'" EXIT

@@ -237,14 +347,25 @@ status "Sending logs to rita/aihunter server $aih_location , My name: $my_id , l
status "Preparing remote directories"
ssh $extra_ssh_params "$aih_location" "mkdir -p ${remote_top_dir}/$today/ ${remote_top_dir}/$yesterday/ ${remote_top_dir}/$twoda/ ${remote_top_dir}/$threeda/ ${remote_top_dir}/current/"

recv_loc="`pwd`/zeek-log-transport.yaml"
request_logs=`get_send_logs $aih_location "/etc/AI-Hunter/zeek-log-transport.yaml" "$recv_loc"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cases where we will want different retention and logs from different sensors. If the sensor is small enough we can transfer all the logs and keep them for longer, but another sensor's logs might be too big where we need to limit what is sent and only keep it for a limited time.

I can think of a couple ways to do this:

  • We could pull from a sensor-specific location first (e.g. $remote_top_dir/zeek-log-transport.yaml) and fall back to /etc/AI-Hunter/zeek-log-transport.yaml if that fails. This seems the most straightforward to me.
  • We could put sensor-specific config sections in the same /etc/AI-Hunter/zeek-log-transport.yaml file. This would be easier if we were actually parsing structured YAML somehow, but it might be too difficult with the current method of using grep.

@@ -0,0 +1,72 @@
Days: 7
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change the default here to be 3 just to match the current behavior?

"weird"
"weird_stats"
"x509"
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not often this would happen, but adding a new log file would mean having to update this list and thus the installed script (which is what we're trying to avoid having to do). For instance, AC-Hunter recently added a new log file open_conn.log. If a user installs a plugin that generates a new log file or Zeek happens to start producing a new log we wouldn't be able to get that log without updating this script.

What if we just tested the string against a regex and only allowed ^[a-zA-Z0-9-_]+$? I'm not aware of anything that could be abused path traversal-wise with only those characters and I'm willing to take the tradeoff of allowing more character combinations in exchange for being more future-proof.

Days: 7
Logs:
#These headings are for user friendliness, they do not do anything
Default:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the headings don't do anything, I think it would be less confusing to make all the individual log files nested directly under Logs: (so de-indent one level) and make the Default:, Troubleshooting:, and All: comments instead of YAML keys.

Samuel Carroll and others added 2 commits September 9, 2021 12:53
…d the remote config to a temp directory, and checked a sensor specific location for a config before a default remote config in the zeek_log_transport.sh file, I have also commented out the YAML keys, removed one level of log names and changed default days to 3 in zeek-transport-default.yaml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow sending more Zeek logs
2 participants