environment .hdr files do not contain a description key and value entry #114

Closed
lewismc opened this issue Jul 4, 2017 · 6 comments

Comments

@lewismc
Member

lewismc commented Jul 4, 2017

When I run an environmental task, a .hdr file is created with the following contents. Note that it is missing a description entry:

ENVI
samples = 739
lines = 14674
bands = 1
header offset = 0
file type = ENVI Classification
data type = 12
interleave = bip
byte order = 0
map info = { UTM , 1.000 , 1.000 , 724522.127 , 4074620.759 , 1.1000000000e+00 , 1.1000000000e+00 , 12 , North , WGS-84 , units=Meters , rotation=75.00000000 }
data ignore value = 0
class names = { No data , Data }
classes = 2
class lookup = { 0 , 0 , 0 , 255 , 0 , 0 }

whereas other tasks generate headers that include a description:

ENVI
description = {
  COAL 0.5.2 mining classified image.}
samples = 739
lines = 14674
bands = 1
header offset = 0
file type = ENVI Classification
data type = 12
interleave = bip
byte order = 0
map info = { UTM , 1.000 , 1.000 , 724522.127 , 4074620.759 , 1.1000000000e+00 , 1.1000000000e+00 , 12 , North , WGS-84 , units=Meters , rotation=75.00000000 }
data ignore value = 0
class names = { No data , Schwertmannite BZ93-1 s06av95a=b , Renyolds_TnlSldgWet SM93-15w s06av95a=a , Renyolds_Tnl_Sludge SM93-15 s06av95a=a }
classes = 4
class lookup = { 0 , 0 , 0 , 255 , 0 , 0 , 0 , 255 , 0 , 0 , 0 , 255 }
@lewismc lewismc added this to the 0.6 milestone Jul 4, 2017
@ghost

ghost commented Jul 5, 2017

@lewismc Which .hdr file are you looking at? The environmental correlation module creates several byproducts during computation, such as the rasterized map (feature_header_name) and the proximity map (proximity_header_name), which aren't necessarily data products we need to keep. The actual environmental correlation image should have a description, as defined in this line:

'description': 'COAL '+pycoal.version+' environmental correlation image.',
Since the temporary files clutter the working directory, logic could be added to delete them if desired.

@ghost

ghost commented Jul 5, 2017

Note that this behavior is already verified by the unit test:

assert actual.metadata.get('description') == 'COAL '+pycoal.version+' environmental correlation image.'
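
The same check can be run by hand against any header on disk, which helps tell the final product apart from the byproducts. A minimal sketch, assuming the header is plain ENVI text like the ones pasted above; parse_envi_header is a hypothetical helper for illustration, not part of pycoal:

```python
def parse_envi_header(text):
    """Parse ENVI header text into a dict of lower-cased keys to string values."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "ENVI":
        raise ValueError("not an ENVI header")
    fields = {}
    i = 1
    while i < len(lines):
        line = lines[i]
        i += 1
        if "=" not in line:
            continue
        # Split on the first "=" only, since values may contain "=" too.
        key, value = line.split("=", 1)
        value = value.strip()
        # Brace-delimited values (e.g. description) may span several lines.
        while value.startswith("{") and "}" not in value and i < len(lines):
            value += " " + lines[i].strip()
            i += 1
        if value.startswith("{") and value.endswith("}"):
            value = value[1:-1].strip()
        fields[key.strip().lower()] = value
    return fields


def has_description(path):
    """Return True if the .hdr file at `path` has a description entry."""
    with open(path) as header_file:
        return "description" in parse_envi_header(header_file.read())
```

Calling has_description on each .hdr file in the working directory should show which ones carry the entry and which do not.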

@lewismc
Member Author

lewismc commented Jul 5, 2017

So it is... I didn't see that. Hmmm. The content from the .hdr file I generated locally is below.

ENVI
samples = 739
lines = 14674
bands = 1
header offset = 0
file type = ENVI Classification
data type = 12
interleave = bip
byte order = 0
map info = { UTM , 1.000 , 1.000 , 724522.127 , 4074620.759 , 1.1000000000e+00 , 1.1000000000e+00 , 12 , North , WGS-84 , units=Meters , rotation=75.00000000 }
data ignore value = 0
class names = { No data , Data }
classes = 2
class lookup = { 0 , 0 , 0 , 255 , 0 , 0 }

@ghost

ghost commented Jul 6, 2017

What is the filename? What call did you make to generate it? Have you tried opening it in QGIS to see what it is?

The EnvironmentalCorrelation.intersect_proximity method takes a mining classified file and a vector layer to generate the intermediate and derivative files. Suppose the mining classified file is 12345_img_class_mining.hdr and the vector filename is streams.shp.

  1. The method will create a rasterized version of the vector named 12345_img_class_mining_streams.hdr in which every pixel is either 0 or 1. This file is only used to generate the proximity map.
  2. Next a proximity map is generated named 12345_img_class_mining_streams_proximity.hdr in which every pixel is a positive integer that counts the number of pixels to the nearest feature. This file is only used to generate the environmental correlation.
  3. Finally the mining classified file and the proximity map are compared pixelwise to produce the environmental correlation image whose name is taken from the correlated_filename argument. I have been naming these something like 12345_img_class_mining_streams_correlation.hdr.
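
The naming convention in these steps can be sketched as follows. This is only an illustration of the convention described here: derived_header_names is a hypothetical helper, not a pycoal function, and the correlation name is really whatever is passed as correlated_filename.

```python
import os


def derived_header_names(mining_filename, vector_filename):
    """Return (rasterized, proximity, correlation) header names by convention."""
    mining_base, ext = os.path.splitext(mining_filename)
    # Only the vector's base name is appended, without directory or extension.
    vector_base = os.path.splitext(os.path.basename(vector_filename))[0]
    rasterized = f"{mining_base}_{vector_base}{ext}"
    proximity = f"{mining_base}_{vector_base}_proximity{ext}"
    correlation = f"{mining_base}_{vector_base}_correlation{ext}"
    return rasterized, proximity, correlation


names = derived_header_names("12345_img_class_mining.hdr", "streams.shp")
```

With the example inputs above, names holds the three .hdr filenames listed in steps 1 to 3.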

For every .hdr file there is a corresponding .img file and possibly auxiliary files in .xml or some other format. I have been following a convention of appending to the filenames at each step, so some get quite long (which might be a problem on certain filesystems).

The intermediate files should probably be deleted. I didn't delete them originally because they were useful for development, because there was a serious time constraint, and because I didn't want to worry about the implications of automatically removing things from the user's filesystem. It might be more elegant to use a temporary filesystem such as /tmp, but this might be operating-system specific (although so far the only target has been x86_64 GNU/Linux).

@ghost

ghost commented Nov 8, 2017

I think the problem here is the intermediate files mentioned above. One option is to just delete them as they are probably not useful. Another option is to add the metadata to them as requested.
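
For the second option, a rough sketch of adding the entry to an existing header as text. Both add_description and the description string are hypothetical and only illustrate the idea; in practice the metadata would more likely be set where the intermediate image is written.

```python
def add_description(header_text, description):
    """Return ENVI header text with a description entry added if it lacks one."""
    lines = header_text.splitlines()
    if not lines or lines[0].strip() != "ENVI":
        raise ValueError("not an ENVI header")
    for line in lines[1:]:
        if "=" in line and line.split("=", 1)[0].strip().lower() == "description":
            return header_text  # already described, leave untouched
    # Use the same two-line layout as the mining classified header.
    entry = ["description = {", "  " + description + "}"]
    return "\n".join([lines[0]] + entry + lines[1:]) + "\n"


# Hypothetical description text for a proximity map byproduct.
patched = add_description("ENVI\nsamples = 739\n", "COAL intermediate proximity map.")
```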

@ghost

ghost commented Mar 2, 2018

Bringing this old issue up to date:

  1. If the aforementioned intermediate files should be preserved after environmental correlation, then this issue can be considered superseded by "Provide Unified Metadata Model (UMM)-compliant product metadata" (#118), which aims to standardize COAL metadata.
  2. Otherwise, logic remains to be implemented in environment.py to safely delete the files or build them in a temporary filesystem as mentioned. Another possibility would be to use something like the tempfile facility. It would be best to implement such a feature in a separate issue.
  3. Another option is to do nothing and leave the temporary files where they are, since it's not elegant but it works. If the deployment strategy involves processing on a temporary filesystem and only copying out the data that is desired, then this would have the same effect as deleting them along the way.
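
Options 2 and 3 amount to the same pattern: build everything in a scratch location and copy out only what is wanted. A minimal sketch using Python's tempfile module, assuming the processing step can be pointed at an arbitrary directory; run_in_scratch and build_products are hypothetical names, not existing pycoal code:

```python
import os
import shutil
import tempfile


def run_in_scratch(build_products, final_names, output_dir):
    """Run build_products(scratch_dir), then keep only the files in final_names."""
    os.makedirs(output_dir, exist_ok=True)
    kept = []
    # The scratch directory and everything left in it is removed on exit.
    with tempfile.TemporaryDirectory() as scratch_dir:
        build_products(scratch_dir)
        for name in final_names:
            destination = os.path.join(output_dir, name)
            shutil.copy2(os.path.join(scratch_dir, name), destination)
            kept.append(destination)
    return kept
```

Here build_products would wrap the environmental correlation call, and final_names would list the correlation .hdr and .img files. Nothing is deleted from the user's own directories, which sidesteps the concern about removing files automatically.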

Thus I will close this issue because it either (1) has been superseded, (2) will be followed up in a separate issue, or (3) no longer requires attention.

@ghost ghost closed this as completed Mar 2, 2018