Mlcp exports binary documents with XML content of other documents #83

Open
jensriga opened this issue Jul 11, 2018 · 3 comments

@jensriga

Situation

I am using mlcp to export all documents of a database to the local filesystem. In the end, I have the correct number of local files, but some files that should be binaries actually contain XML content from other documents. The XML documents themselves are okay.

Steps to reproduce the issue

  1. Unzip the content of import.zip into a local folder C:\Temp\mlcp\import :
    files
  2. Use mlcp to import the files into an empty database:
    mlcp.bat import -host localhost -port 8070 -username **** -password **** -mode local -input_file_path C:\Temp\mlcp\import -output_uri_replace "/C:/Temp/mlcp/import,''"
  3. Observe content of database in Query Console:
    in_ml
    For comparison with later results, and to make sure everything was still okay after the import, I used XQuery to determine the size of every document:
    for $doc in fn:doc()
    let $uri := fn:document-uri($doc)
    let $size :=
      if (fn:exists($doc/binary()))
      then xdmp:binary-size($doc/binary())
      else xdmp:binary-size(xdmp:unquote(xdmp:quote($doc), (), "format-binary")/binary())
    order by $uri ascending
    return $uri || " -> " || $size
    size_in_ml
    Everything looks good so far.
  4. Use mlcp to export all documents to the local filesystem:
    mlcp.bat export -host localhost -port 8070 -username **** -password **** -mode local -output_file_path C:\Temp\mlcp\export
  5. Compare import and export directory:
    comparison
    The XML documents and 5 out of 8 binary documents are okay. The problem is that image-003.gif and image-008.gif now have the same content as doc-A.xml, and image-007.gif has the same content as doc-B.xml.
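
The comparison in step 5 can also be scripted, which helps when the test set is larger than a handful of files. The following is only a minimal sketch, not part of mlcp: it assumes the export directory mirrors the relative paths of the import directory, and the function names `file_digests`, `compare_trees`, and `find_duplicates` are hypothetical helpers made up for this illustration. It compares files byte for byte via SHA-256, so an XML document could be flagged as changed if the export serializes it differently, while a binary should come back identical.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path relative to root (POSIX style) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(import_dir, export_dir):
    """Return (missing, extra, changed) relative paths between the two trees."""
    before = file_digests(import_dir)
    after = file_digests(export_dir)
    missing = sorted(set(before) - set(after))   # imported but not exported
    extra = sorted(set(after) - set(before))     # exported but never imported
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return missing, extra, changed


def find_duplicates(directory):
    """Group files within one tree that have byte-identical content."""
    groups = {}
    for rel, digest in file_digests(directory).items():
        groups.setdefault(digest, []).append(rel)
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)


# Example usage with the directories from the steps above (not executed here):
# missing, extra, changed = compare_trees(r"C:\Temp\mlcp\import", r"C:\Temp\mlcp\export")
# duplicates = find_duplicates(r"C:\Temp\mlcp\export")
```

With the behaviour described in this report, `changed` would be expected to list the broken .gif files, and `find_duplicates` on the export directory would be expected to group each of them with the XML document whose content it received.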

My system environment

  • Windows 10 Pro (1709)
  • MarkLogic Server 9.0-5.1
  • mlcp 9.0.6
  • java 1.8.0_172
@jensriga
Author

Maybe this helps: I was able to reproduce the issue on a clean CentOS 7 VM with a new installation of MarkLogic Server.

  • CentOS Linux release 7.5.1804 (Core)
  • MarkLogic Server 9.0-6
  • mlcp 9.0.6
  • java: OpenJDK Runtime Environment (build 1.8.0_171-b10)

The only significant difference: under Linux all 8 binary files are broken, not just 3 out of 8 as under Windows 10.

linux-compare
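
When no pristine import directory is at hand, a broken export can still be spotted from the file header alone, because a valid GIF file starts with the ASCII signature `GIF87a` or `GIF89a`. This is a rough sketch under that single assumption. The helper names `looks_like_gif` and `suspicious_gifs` are hypothetical, and the check only covers files whose names end in `.gif`, as in this test set.

```python
from pathlib import Path

# Every valid GIF file begins with one of these two 6-byte signatures.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def looks_like_gif(path):
    """Return True if the file starts with a GIF signature."""
    with open(path, "rb") as handle:
        return handle.read(6) in GIF_SIGNATURES


def suspicious_gifs(export_dir):
    """List *.gif files (relative paths) whose content does not start like a GIF."""
    root = Path(export_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.gif")
        if path.is_file() and not looks_like_gif(path)
    )


# Example usage (not executed here); the path is a placeholder:
# print(suspicious_gifs("/path/to/export"))
```

A .gif file that received XML content during export would fail this check, since its first bytes would be markup rather than a GIF signature.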

@mattsunsjf
Contributor

Good bug report!

@mattsunsjf mattsunsjf self-assigned this Aug 9, 2018
@mattsunsjf mattsunsjf modified the milestones: 9.0.7, 9.0.8 Aug 9, 2018
@mattsunsjf mattsunsjf modified the milestones: 9.0.8, 9.0.9 Dec 4, 2018
@mattsunsjf mattsunsjf modified the milestones: 9.0.9, 10.0.1 Jan 28, 2019
@mattsunsjf mattsunsjf assigned yunzvanessa and unassigned mattsunsjf Mar 22, 2019
@mattsunsjf mattsunsjf modified the milestones: 10.0.1, 10.0.2 Mar 22, 2019
@dbarriguete

import.zip
Hello, I have reviewed this situation and have made a minor change in the "Export-binary-bug" branch. With this change, export writes the correct file content for both binary and text files.

Attached to this comment is a zip with more files for testing purposes.

@yunzvanessa yunzvanessa modified the milestones: 10.0.2, 10.0.3 Aug 24, 2019
@yunzvanessa yunzvanessa modified the milestones: 10.0.3, 10.0.5 Apr 1, 2020
@yunzvanessa yunzvanessa modified the milestones: 10.0.5, 10.0.6 Sep 12, 2020
@yunzvanessa yunzvanessa modified the milestones: 10.0.6, 10.0.8 May 22, 2021
@yunzvanessa yunzvanessa assigned abika5 and unassigned yunzvanessa Sep 27, 2021
@yunzvanessa yunzvanessa modified the milestones: 10.0.8, 10.0.9 Sep 27, 2021
@abika5 abika5 modified the milestones: 10.0.9, 10.0-10 Jan 28, 2022
@yunzvanessa yunzvanessa modified the milestones: 11.0.0, 11.1.0 May 15, 2023
@abika5 abika5 modified the milestones: 11.1.0, 11.2.0 Jan 3, 2024
@abika5 abika5 modified the milestones: 11.3.0, 11.4.0 Jun 20, 2024