T A B L E o f C O N T E N T S
---------------------------------

1    Building Adobe XMPsdk and Samples in Terminal with the ./Generate_XXX_mac.sh scripts
1.1  Amazing Discovery 1 DumpFile is linked to libstdc++.6.dylib
1.2  Amazing Discovery 2 Millions of "weak symbol/visibility" messages

4    Build design for v0.26.1
4.8  Support for MinGW

5    Refactoring the Tiff Code
5.1  Background
5.2  How does Exiv2 decode the ExifData in a JPEG?
5.3  How is metadata organized in Exiv2
5.4  Where are the tags defined?
5.5  How do the MakerNotes get decoded?
5.6  How do the encoders work?

6    Using external XMP SDK via Conan

==========================================================================

4 Build design for v0.26.1

Added   : 2017-08-18
Modified: 2017-08-23

The purpose of v0.26.1 is to release bug fixes and
experimental new features which may become defaults in v0.27.

4.8 Support for MinGW
MinGW msys/1.0 was deprecated when v0.26 was released.
It is very likely that Exiv2 will still build with MinGW msys/1.0.
However, I will not provide any user support for MinGW msys/1.0 in future.

MinGW msys/2.0 might be supported as "experimental" in Exiv2 v0.26.2.


==========================================================================

5 Refactoring the Tiff Code

Added   : 2017-09-24
Modified: 2017-09-24

5.1 Background
Tiff parsing is the core code of a metadata engine.

The Tiff parsing code in Exiv2 is very difficult to understand and has major architectural shortcomings:

1) It requires the Tiff file to be totally in memory.
2) It cannot handle BigTiff.
3) The parser doesn't know the source of the in-memory tiff image.
4) It uses memory mapping on the tiff file:
   - if the network connection is lost, horrible things happen
   - it requires a lot of VM to map the complete file
   - a BigTiff file can be 100GB+
   - memory mapping causes problems with virus detection software on Windows
5) The parser cannot deal with multi-page tiff files.
6) It requires the total file to be in contiguous memory and defeats 'webready'.
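
To make shortcoming 2) concrete, here is a minimal sketch (not Exiv2 code) of how a parser could
tell classic Tiff from BigTiff by looking only at the header. It assumes the published header
layouts: a byte-order mark "II" or "MM", then magic number 42 with a 4-byte IFD offset for classic
Tiff, or magic number 43 with an offset size of 8 and an 8-byte IFD offset for BigTiff. The names
TiffHeaderInfo, readUInt and readTiffHeader are made up for this note.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not an Exiv2 type).
struct TiffHeaderInfo {
    bool     valid;         // false if the bytes are not a Tiff/BigTiff header
    bool     littleEndian;  // true for "II", false for "MM"
    bool     bigTiff;       // true when the magic number is 43
    uint64_t firstIfd;      // offset of the first IFD from the start of the Tiff data
};

// Read an unsigned integer of n bytes (n <= 8) in the given byte order.
static uint64_t readUInt(const unsigned char* p, std::size_t n, bool littleEndian)
{
    assert(n <= 8);
    uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Always consume the most significant byte first.
        const std::size_t idx = littleEndian ? n - 1 - i : i;
        v = (v << 8) | p[idx];
    }
    return v;
}

TiffHeaderInfo readTiffHeader(const unsigned char* data, std::size_t size)
{
    TiffHeaderInfo info = {false, false, false, 0};
    if (size < 8) return info;

    if (data[0] == 'I' && data[1] == 'I')      info.littleEndian = true;
    else if (data[0] == 'M' && data[1] == 'M') info.littleEndian = false;
    else return info;

    const uint64_t magic = readUInt(data + 2, 2, info.littleEndian);
    if (magic == 42) {
        // Classic Tiff: 4-byte offset of the first IFD at byte 4.
        info.firstIfd = readUInt(data + 4, 4, info.littleEndian);
        info.valid    = true;
    } else if (magic == 43) {
        // BigTiff: offset size (must be 8), a zero word, then an 8-byte offset at byte 8.
        if (size < 16) return info;
        if (readUInt(data + 4, 2, info.littleEndian) != 8) return info;
        if (readUInt(data + 6, 2, info.littleEndian) != 0) return info;
        info.bigTiff  = true;
        info.firstIfd = readUInt(data + 8, 8, info.littleEndian);
        info.valid    = true;
    }
    return info;
}
```

A parser built this way only needs the first 16 bytes to decide which offset width to use, so it
would not need the whole file in memory just to get started.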

The Tiff parsing code in Exiv2 is also ingenious. It's very robust and works well. It can:

1) Handle 32-bit Tiff and many Raw formats (which are derived from Tiff).
2) Read and write manufacturers' MakerNotes, which are (mostly) in Tiff format.
3) Probably do other great things that I haven't discovered
   - because the code is so hard to understand, I can't simply browse and read it.
4) Separate file navigation from data analysis.

The code in Image::printStructure was originally written to understand "what is a tiff?"
It has problems:
1) It was intended to be a single-threaded debugging function and has security issues.
2) It doesn't handle BigTiff.
3) It's messy. It reads and processes metadata simultaneously.

The aim of this project is to:
1) Reconsider the Tiff code.
2) Keep everything good in the code and address known deficiencies.
3) Establish a Team Exiv2 "Tiff Expert" who knows the code intimately.

5.2 How does Exiv2 decode the ExifData in a JPEG?
You can get my test file from http://clanmills.com/Stonehenge.jpg

808 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $ exiv2 -pS ~/Stonehenge.jpg
STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI
       2 | 0xffe1 APP1  |   15288 | Exif..II*......................
   15292 | 0xffe1 APP1  |    2610 | http://ns.adobe.com/xap/1.0/.<?x
   17904 | 0xffed APP13 |      96 | Photoshop 3.0.8BIM.......'.....
   18002 | 0xffe2 APP2  |    4094 | MPF.II*...............0100.....
   22098 | 0xffdb DQT   |     132
   22232 | 0xffc0 SOF0  |      17
   22251 | 0xffc4 DHT   |     418
   22671 | 0xffda SOS
809 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $

Exiv2 calls JpegBase::readMetadata which locates the APP1/Exif segment.
It invokes the ExifParser:
    ExifParser::decode(exifData_, rawExif.pData_, rawExif.size_);
This is a thin wrapper over:
    TiffParserWorker::decode(....) in tiffimage.cpp

What happens then? I don't know. The metadata is decoded in:
    tiffvisitor.cpp TiffDecoder::visitEntry()

The design of the TiffMumble classes is the "Visitor" pattern
described in "Design Patterns" by Gamma, Helm, Johnson and Vlissides (published by
Addison-Wesley). The aim of the pattern
is to separate parsing from dealing with the data.

The data is stored in ExifData, which is a vector.
Order is important and preserved.
As the data values are recovered they are stored as Exifdatum objects in the vector.

How does the tiff visitor work? I think the reader and processor
are connected by this line in TiffParser::
    rootDir->accept(reader);

The class tree for the decoder is:

class TiffDecoder : public TiffFinder {
  class TiffReader ,
  class TiffFinder : public TiffVisitor {
    class TiffVisitor {
    public:
        //! Events for the stop/go flag. See setGo().
        enum GoEvent {
            geTraverse = 0,
            geKnownMakernote = 1
        };

        void setGo(GoEvent event, bool go);
        virtual void visitEntry(TiffEntry* object) =0;
        virtual void visitDataEntry(TiffDataEntry* object) =0;
        virtual void visitImageEntry(TiffImageEntry* object) =0;
        virtual void visitSizeEntry(TiffSizeEntry* object) =0;
        virtual void visitDirectory(TiffDirectory* object) =0;
        virtual void visitSubIfd(TiffSubIfd* object) =0;
        virtual void visitMnEntry(TiffMnEntry* object) =0;
        virtual void visitIfdMakernote(TiffIfdMakernote* object) =0;
        virtual void visitIfdMakernoteEnd(TiffIfdMakernote* object);
        virtual void visitBinaryArray(TiffBinaryArray* object) =0;
        virtual void visitBinaryArrayEnd(TiffBinaryArray* object);
        //! Operation to perform for an element of a binary array
        virtual void visitBinaryElement(TiffBinaryElement* object) =0;

        //! Check if stop flag for \em event is clear, return true if it's clear.
        bool go(GoEvent event) const;
    }
  }
}

The reader works by stepping along the Tiff directory, calling the visitor's
"callbacks" as it reads.
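
Here is a minimal sketch of the Visitor idea, to show how accept() and the visitXXX() callbacks
hang together. None of this is Exiv2 code: DemoVisitor, DemoComponent, DemoEntry, DemoDirectory,
TagCollector and makeDemoTree are made-up names that only mimic the shape of the TiffXXX classes.
The tree knows how to walk itself; the visitor decides what to do with each node.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

class DemoEntry;
class DemoDirectory;

// The visitor interface: one callback per kind of node.
class DemoVisitor {
public:
    virtual ~DemoVisitor() {}
    virtual void visitEntry(DemoEntry& entry) = 0;
    virtual void visitDirectory(DemoDirectory& dir) = 0;
};

// Every node of the tree can accept a visitor.
class DemoComponent {
public:
    virtual ~DemoComponent() {}
    virtual void accept(DemoVisitor& visitor) = 0;
};

class DemoEntry : public DemoComponent {
public:
    explicit DemoEntry(uint16_t tag) : tag_(tag) {}
    uint16_t tag() const { return tag_; }
    void accept(DemoVisitor& visitor) override { visitor.visitEntry(*this); }
private:
    uint16_t tag_;
};

class DemoDirectory : public DemoComponent {
public:
    void add(std::unique_ptr<DemoComponent> child) { children_.push_back(std::move(child)); }
    // Navigation lives here: visit the directory, then each child in stored order.
    void accept(DemoVisitor& visitor) override {
        visitor.visitDirectory(*this);
        for (auto& child : children_) child->accept(visitor);
    }
private:
    std::vector<std::unique_ptr<DemoComponent>> children_;
};

// Data handling lives here: this visitor just records what it sees.
class TagCollector : public DemoVisitor {
public:
    void visitEntry(DemoEntry& entry) override { tags.push_back(entry.tag()); }
    void visitDirectory(DemoDirectory&) override { ++directories; }
    std::vector<uint16_t> tags;
    int directories = 0;
};

// Build a root directory with two entries and one nested directory.
std::unique_ptr<DemoDirectory> makeDemoTree()
{
    std::unique_ptr<DemoDirectory> root(new DemoDirectory);
    root->add(std::unique_ptr<DemoComponent>(new DemoEntry(0x010f)));
    root->add(std::unique_ptr<DemoComponent>(new DemoEntry(0x0110)));
    std::unique_ptr<DemoDirectory> sub(new DemoDirectory);
    sub->add(std::unique_ptr<DemoComponent>(new DemoEntry(0x829a)));
    root->add(std::move(sub));
    return root;
}
```

Calling makeDemoTree()->accept(collector) with a TagCollector fills collector.tags in tree order.
A different visitor (a printer, an encoder) could walk the same tree without any change to the
node classes, which is the separation the pattern is after.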

There are 2000 lines of code in tiffcomposite.cpp and, to be honest,
I don't know what most of it does!

Set a breakpoint in src/exif.cpp#571.
That's where the key/value is added to the exifData vector.
Exactly how did the code get here? That's a puzzle.

    void ExifData::add(const ExifKey& key, const Value* pValue)
    {
        add(Exifdatum(key, pValue));
    }

5.3 How is metadata organized in Exiv2
Every key has the form section.group.tag, for example Exif.Image.YResolution.

section: Exif | IPTC | Xmp
group:   Photo | Image | MakerNote | Nikon3 ....
tag:     YResolution etc ...
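
As a small illustration of the section.group.tag layout, this sketch splits a key string at the
first two dots, which is what the cut commands in the shell sessions in this section do. KeyParts
and splitKey are made-up names for this note; Exiv2 has its own key classes and this sketch does
not use them.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the three parts of a key.
struct KeyParts {
    std::string section;  // e.g. "Exif"
    std::string group;    // e.g. "Image"
    std::string tag;      // e.g. "YResolution"
    bool        valid;    // false if the key does not have three non-empty parts
};

KeyParts splitKey(const std::string& key)
{
    KeyParts parts;
    parts.valid = false;

    const std::string::size_type first = key.find('.');
    if (first == std::string::npos) return parts;
    const std::string::size_type second = key.find('.', first + 1);
    if (second == std::string::npos) return parts;

    parts.section = key.substr(0, first);
    parts.group   = key.substr(first + 1, second - first - 1);
    parts.tag     = key.substr(second + 1);  // everything after the second dot
    parts.valid   = !parts.section.empty() && !parts.group.empty() && !parts.tag.empty();
    return parts;
}
```

For example, splitKey("Exif.Image.YResolution") yields section "Exif", group "Image" and tag
"YResolution".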

820 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa ~/Stonehenge.jpg | cut -d' ' -f 1 | cut -d. -f 1 | sort | uniq
Exif
Iptc
Xmp

821 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg | cut -d'.' -f 2 | sort | uniq
GPSInfo
Image
Iop
MakerNote
Nikon3
NikonAf2
NikonCb2b
NikonFi
NikonIi
NikonLd3
NikonMe
NikonPc
NikonVr
NikonWt
Photo
Thumbnail

822 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg | cut -d'.' -f 3 | cut -d' ' -f 1 | sort | uniq
AFAperture
AFAreaHeight
AFAreaMode
...
XResolution
YCbCrPositioning
YResolution
823 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

The data in IFD0 is Exif.Image:

826 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pR ~/Stonehenge.jpg | head -20
STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI
       2 | 0xffe1 APP1  |   15288 | Exif..II*......................
STRUCTURE OF TIFF FILE (II): MemIo
 address | tag                     |     type | count | offset | value
      10 | 0x010f Make             |    ASCII |    18 |    146 | NIKON CORPORATION
      22 | 0x0110 Model            |    ASCII |    12 |    164 | NIKON D5300
      34 | 0x0112 Orientation      |    SHORT |     1 |        | 1
      46 | 0x011a XResolution      | RATIONAL |     1 |    176 | 300/1
      58 | 0x011b YResolution      | RATIONAL |     1 |    184 | 300/1
      70 | 0x0128 ResolutionUnit   |    SHORT |     1 |        | 2
      82 | 0x0131 Software         |    ASCII |    10 |    192 | Ver.1.00
      94 | 0x0132 DateTime         |    ASCII |    20 |    202 | 2015:07:16 20:25:28
     106 | 0x0213 YCbCrPositioning |    SHORT |     1 |        | 1
     118 | 0x8769 ExifTag          |     LONG |     1 |        | 222
STRUCTURE OF TIFF FILE (II): MemIo
 address | tag                     |     type | count | offset | value
     224 | 0x829a ExposureTime     | RATIONAL |     1 |    732 | 10/4000
     236 | 0x829d FNumber          | RATIONAL |     1 |    740 | 100/10
827 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Image ~/Stonehenge.jpg
Exif.Image.Make               Ascii      18  NIKON CORPORATION
Exif.Image.Model              Ascii      12  NIKON D5300
Exif.Image.Orientation        Short       1  top, left
Exif.Image.XResolution        Rational    1  300
Exif.Image.YResolution        Rational    1  300
Exif.Image.ResolutionUnit     Short       1  inch
Exif.Image.Software           Ascii      10  Ver.1.00
Exif.Image.DateTime           Ascii      20  2015:07:16 20:25:28
Exif.Image.YCbCrPositioning   Short       1  Centered
Exif.Image.ExifTag            Long        1  222
Exif.Nikon3.ImageBoundary     Short       4  0 0 6000 4000
Exif.Nikon3.ImageDataSize     Long        1  6173648
Exif.NikonAf2.AFImageWidth    Short       1  0
Exif.NikonAf2.AFImageHeight   Short       1  0
Exif.Photo.ImageUniqueID      Ascii      33  090caaf2c085f3e102513b24750041aa
Exif.Image.GPSTag             Long        1  4060
828 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $
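
In the -pR listing the entry addresses go up by 12 each time (10, 22, 34, ...). That matches the
classic Tiff directory layout: a 2-byte entry count, then 12-byte entries made of tag (2 bytes),
type (2), count (4) and a value-or-offset field (4). The sketch below reads such a directory from
a little-endian ("II") buffer. It is not Exiv2 code: DemoIfdEntry, readLE and readIfdLE are
made-up names, and it does not handle BigTiff or big-endian data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of one 12-byte classic Tiff directory entry.
struct DemoIfdEntry {
    uint16_t tag;            // e.g. 0x0112 Orientation
    uint16_t type;           // e.g. 3 = SHORT
    uint32_t count;          // number of values
    uint32_t valueOrOffset;  // the value itself if it fits in 4 bytes, else an offset
};

// Read an n-byte (n <= 4) little-endian unsigned integer.
static uint32_t readLE(const unsigned char* p, std::size_t n)
{
    assert(n <= 4);
    uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}

// Read all entries of the directory at ifdOffset. Returns an empty vector
// if the directory does not fit inside the buffer.
std::vector<DemoIfdEntry> readIfdLE(const unsigned char* tiff, std::size_t size, std::size_t ifdOffset)
{
    std::vector<DemoIfdEntry> entries;
    if (ifdOffset > size || size - ifdOffset < 2) return entries;

    const std::size_t n = readLE(tiff + ifdOffset, 2);
    if ((size - ifdOffset - 2) / 12 < n) return entries;  // not enough bytes for n entries

    for (std::size_t i = 0; i < n; ++i) {
        const unsigned char* p = tiff + ifdOffset + 2 + 12 * i;
        DemoIfdEntry e;
        e.tag           = static_cast<uint16_t>(readLE(p, 2));
        e.type          = static_cast<uint16_t>(readLE(p + 2, 2));
        e.count         = readLE(p + 4, 4);
        e.valueOrOffset = readLE(p + 8, 4);
        entries.push_back(e);
    }
    return entries;
}
```

With an 8-byte Tiff header and the first directory at offset 8, the first entry sits at address
8 + 2 = 10, which is where the listing starts.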

The data in the Exif IFD (which Exif.Image.ExifTag points to) is Exif.Photo.

The data in the MakerNote is another embedded TIFF (which contains more embedded tiffs):

829 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep MakerNote ~/Stonehenge.jpg
Exif.Photo.MakerNote          Undefined 3152  (Binary value suppressed)
Exif.MakerNote.Offset         Long         1  914
Exif.MakerNote.ByteOrder      Ascii        3  II
830 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

The MakerNote decoder decodes them into:

Exif.Nikon3, Exif.NikonAf2 and so on. I don't know exactly how it achieves this.
However it means that tag numbers can be reused in different IFDs.
Tag 0x0016 = Nikon GPSSpeed and can mean something different elsewhere.

5.4 Where are the tags defined?

There's an array of "TagInfo" data structures in each of the makernote decoders.
These define the tag (a number), the tag name, the groupID (eg canonId) and the default type.
There's also a callback to print the value of the tag. This does the "interpretation"
that is performed by the -pt option of the exiv2 command-line program.

    TagInfo(0x4001, "ColorData", N_("Color Data"), N_("Color data"), canonId, makerTags, unsignedShort, -1, printValue),
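
To show what a tag table with a print callback buys you, here is a cut-down sketch. It is not the
real TagInfo: DemoTagInfo, DemoPrintFct, findTag, interpret and the print functions are made-up
names with far fewer fields. The two interpreted values (Orientation 1 = "top, left",
ResolutionUnit 2 = "inch") are taken from the exiv2 -pR and -pa listings in section 5.3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical callback type: turns a raw value into readable text.
typedef std::string (*DemoPrintFct)(long value);

// Hypothetical, simplified stand-in for a TagInfo record.
struct DemoTagInfo {
    uint16_t     tag;    // tag number
    const char*  name;   // tag name
    const char*  group;  // group name
    DemoPrintFct print;  // "interpretation" callback
};

std::string printPlain(long value) { return std::to_string(value); }

std::string printOrientation(long value)
{
    return value == 1 ? std::string("top, left") : printPlain(value);
}

std::string printResolutionUnit(long value)
{
    return value == 2 ? std::string("inch") : printPlain(value);
}

static const DemoTagInfo demoImageTags[] = {
    {0x0112, "Orientation",    "Image", printOrientation},
    {0x0128, "ResolutionUnit", "Image", printResolutionUnit},
    {0x8769, "ExifTag",        "Image", printPlain},
};
static const std::size_t demoImageTagCount = sizeof(demoImageTags) / sizeof(demoImageTags[0]);

// Linear search of one group's table. A tag number only has meaning within its table.
const DemoTagInfo* findTag(const DemoTagInfo* table, std::size_t count, uint16_t tag)
{
    for (std::size_t i = 0; i < count; ++i) {
        if (table[i].tag == tag) return &table[i];
    }
    return nullptr;
}

// Interpret a raw value through the table's callback; unknown tags print as plain numbers.
std::string interpret(const DemoTagInfo* table, std::size_t count, uint16_t tag, long value)
{
    const DemoTagInfo* info = findTag(table, count, tag);
    return info ? info->print(value) : printPlain(value);
}
```

Because each group has its own table, the same tag number can appear in several tables with a
different name and a different callback in each.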

5.5 How do the MakerNotes get decoded?

I don't know. It has something to do with this code in tiffcomposite.cpp#936

    TiffMnEntry::doAccept(TiffVisitor& visitor) { ... }

Most makernotes are Tiff structures. So the TiffXXX classes are invoked recursively to decode the maker note.

#0 0x000000010058b4b0 in Exiv2::Internal::TiffDirectory::doAccept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:916
   This function iterates over the array of entries

#1 0x000000010058b3c6 in Exiv2::Internal::TiffComponent::accept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:891
#2 0x00000001005b5357 in Exiv2::Internal::TiffParserWorker::parse(unsigned char const*, unsigned int, unsigned int, Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:2006
   This function creates an array of TiffEntries

#3 0x00000001005a2a60 in Exiv2::Internal::TiffParserWorker::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int, unsigned int, void (Exiv2::Internal::TiffDecoder::* (*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, Exiv2::Internal::IfdId))(Exiv2::Internal::TiffEntryBase const*), Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:1900
#4 0x00000001005a1ae9 in Exiv2::TiffParser::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:260
#5 0x000000010044d956 in Exiv2::ExifParser::decode(Exiv2::ExifData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/exif.cpp:625
#6 0x0000000100498fd7 in Exiv2::JpegBase::readMetadata() at /Users/rmills/gnu/github/exiv2/exiv2/src/jpgimage.cpp:386
#7 0x000000010000bc59 in Action::Print::printList() at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:530
#8 0x0000000100005835 in Action::Print::run(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:245


5.6 How do the encoders work?

I understand writeMetadata() and will document that soon.
I still have to study how the TiffVisitor writes metadata.


6 Using external XMP SDK via Conan

Section 1 describes how to compile the newer versions of XMP SDK with a bash script. This
approach had a few limitations:

1) We had to include sources from other projects in the Exiv2 repository: see the folder
   xmpsdk/third-party.
2) Different scripts were needed to compile XMP SDK on Linux, Mac OSX and Windows.
3) There were a lot of configuration/compilation issues depending on the system configuration.

Taking into account that during the last months we have made a big effort to migrate the
handling of 3rd party dependencies to Conan, we have decided to do the same here. A conan recipe
has been written for XmpSdk at:

    https://github.com/piponazo/conan-xmpsdk

The recipe and package binaries can be found in piponazo's bintray repository:

    https://bintray.com/piponazo/piponazo

This conan recipe provides a custom CMake finder that our CMake code uses to
find XMP SDK in the conan cache and to set the CMake variables ${XMPSDK_LIBRARY} and
${XMPSDK_INCLUDE_DIR}.

These are the steps you will need to follow to configure the project with external XMP support:

    # Add the conan-piponazo remote to your conan configuration (only once)
    conan remote add conan-piponazo https://api.bintray.com/conan/piponazo/piponazo

    mkdir build && cd build

    # Run conan to bring the dependencies. Note that the XMPSDK is not enabled by default and you will
    # need to enable the xmp option to bring it.
    conan install .. --options xmp=True

    # Configure the project with support for the external XMP version. Disable the normal XMP version
    cmake -DCMAKE_BUILD_TYPE=Release -DEXIV2_ENABLE_XMP=OFF -DEXIV2_ENABLE_EXTERNAL_XMP=ON -DBUILD_SHARED_LIBS=ON ..

Note that the usage of the newer versions of XMP is experimental and it was included in Exiv2
because a few users have requested it.