Map tensorflow/c/c_api_experimental.h header file #410

Open: saudet wants to merge 1 commit into master

@saudet
Contributor

saudet commented Jan 18, 2022

So that we can call goodies like TF_LoadPluggableDeviceLibrary(), see issue #409.

/cc @ashesfall

@karllessard added the "CI build" and "DON'T MERGE" labels and removed the "DON'T MERGE" label on Jan 18, 2022
@karllessard
Collaborator

Thanks @saudet, looks good. I've triggered a CI build before merging.

@karllessard
Collaborator

@saudet, I got a bunch of errors on Windows. Any clue?

jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_SetLogicalCpuDevices
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderGetVariable
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextEnableGraphCollection
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DeleteAttrBuilder
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextOptionsSetTfrt
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ExecutorWaitForAllPendingNodes
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_InsertConfigKeyValue
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_SetXlaMinClusterSize
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_CancellationManagerStartCancel
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringDeleteStringGauge3
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextGetExecutorForThread
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DeleteShapeAndTypeListArray
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_SetXlaAutoJitMode
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_NewAttrBuilder
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringDeleteStringGauge4
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_GraphDebugString
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_ImportGraphDefOptionsSetValidateColocationConstraints
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextOptionsSetTfrtDistributedRuntime
jnitensorflow.obj : error LNK2001: unresolved external symbol TFE_GetServerDef
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_CreatePackedTensorHandle
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_TensorHandleDeviceType
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringNewStringGauge3
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderGetVariableShape
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_NewCheckpointReader
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DeleteCheckpointReader
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_NewTensorHandleFromTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_AttrBuilderSetType
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderSize
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_ShapeAndTypeListSetDtype
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderGetTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_InitMain
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_GetContextId
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_HostAddressSpace
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_ShapeAndTypeListSetShape
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_ShapeAndTypeListSetUnknownShape
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_GetExecutedOpNames
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_CancellationManagerIsCancelled
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CreateRunOptions
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_SetXlaConstantFoldingDisabled
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_NewCancellationManager
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderHasTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextGetFunctionDef
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_TensorHandleGetStatus
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_GetNumberAttrForOpListInput
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DeletePluggableDeviceLibraryHandle
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderGetVariableDataType
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_TensorHandleDeviceID
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ExecutorClearError
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_NewExecutor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_SetXlaEnableLazyCompilation
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CheckpointReaderGetVariableNumDims
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_DeleteExecutor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_SetTfXlaCpuGlobalJit
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_InferShapes
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_PickUnusedPortOrDie
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_FunctionDebugString
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_LoadPluggableDeviceLibrary
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_AttrBuilderSetTypeList
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_GetConfigKeyValue
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_AbortCollectiveOps
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringDeleteBuckets
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_EnqueueNamedTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_EnableCollectiveOps
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ExecutorIsAsync
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_DeleteCancellationManager
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_AllocateHostTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringNewExponentialBuckets
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_NewShapeAndTypeList
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_GetXlaConstantFoldingDisabled
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_MakeInternalErrorStatus
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_OpSetCancellationManager
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_CollectiveOpsCheckPeerHealth
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_MonitoringNewStringGauge4
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DequeueNamedTensor
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextSetSoftDevicePlacement
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_DeleteConfigKeyValue
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_EnableXLACompilation
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextDisableGraphCollection
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ReportErrorToCluster
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextSetExecutorForThread
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_ContextSetLogDevicePlacement
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_CreateConfig
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_DeleteShapeAndTypeList
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_AttrBuilderCheckCanRunOnDevice
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_NewTensorHandleFromScalar
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TFE_OpReset
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_OpIsStateful
jnitensorflow.dll : fatal error LNK1120: 87 unresolved externals
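
A log this long is easier to read once the symbol names are pulled out. The following is a minimal sketch and not part of the PR: it assumes the linker output is available as text, and the name `unresolved_symbols` is made up for illustration. It collects the names from the LNK2001 lines and strips the `__imp_` prefix that MSVC puts on references to functions declared `__declspec(dllimport)`.

```python
import re

# Matches MSVC lines such as:
#   jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_InitMain
LNK2001 = re.compile(r"error LNK2001: unresolved external symbol (\S+)")


def unresolved_symbols(log_text):
    """Return (name, was_dllimport) pairs for each LNK2001 line in log_text."""
    result = []
    for line in log_text.splitlines():
        match = LNK2001.search(line)
        if match is None:
            continue  # e.g. the final LNK1120 summary line
        symbol = match.group(1)
        # References to __declspec(dllimport) functions carry an __imp_ prefix.
        imported = symbol.startswith("__imp_")
        name = symbol[len("__imp_"):] if imported else symbol
        result.append((name, imported))
    return result


# Three lines copied from the log above.
sample = """\
jnitensorflow.obj : error LNK2001: unresolved external symbol __imp_TF_LoadPluggableDeviceLibrary
jnitensorflow.obj : error LNK2001: unresolved external symbol TFE_GetServerDef
jnitensorflow.dll : fatal error LNK1120: 87 unresolved externals
"""

for name, imported in unresolved_symbols(sample):
    print(name, "(dllimport)" if imported else "(direct reference)")
```

Run over the full log, this separates out TFE_GetServerDef, which is the only symbol in the list referenced without the `__imp_` prefix.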

@saudet
Contributor Author

saudet commented Jan 19, 2022

Hmm, it looks like the experimental APIs are not exported on Windows. I remember bumping into that previously. I suppose we can always hack the Bazel build to export them... TF Core's build seems to use this Python script to exclude symbols, but I don't see where it removes any of the "experimental" functions:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/def_file_filter/def_file_filter.py.tpl

Ah, yes, I remember now. The experimental APIs are not actually compiled at all by Bazel. We need to compile them separately, like this:

<compilerOptions>
    <!-- TODO: Remove files from here as they get integrated into the Bazel build -->
    <compilerOption>${project.basedir}/bazel-${project.artifactId}/external/org_tensorflow/tensorflow/c/eager/gradients.cc</compilerOption>
</compilerOptions>

The problem with that, though, is that the experimental APIs themselves need to link with the internal API, which is not (and cannot be) exported on Windows... 🤦 Why can't they just create multiple DLLs like PyTorch does, seriously.

It does work on Linux and Mac though, but then we're alienating Windows users...

@Craigacp
Collaborator

If we're doing this mainly for pluggable device support then I think the only extant device is the one Apple released for macOS, so it doesn't matter too much if it's not working on Windows.

@saudet
Contributor Author

saudet commented Jan 19, 2022

Well well well, look at that, the Python binaries on PyPI do export it on Windows:

$ winedump -j export python/_pywrap_tensorflow_internal.pyd | grep TF_LoadPluggableDeviceLibrary
  0089FAB0 42047 TF_LoadPluggableDeviceLibrary
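
The same check can be scripted for several symbols at once. This is a minimal sketch, assuming the export lines look like the one above (hex entry point, ordinal, name). `parse_exports` is a hypothetical helper and `TF_NoSuchFunction` is a made-up name used only to show the negative case; lines that do not match the pattern are skipped.

```python
import re

# One export line as printed above: hex entry point, decimal ordinal, symbol name.
EXPORT_LINE = re.compile(r"^\s*([0-9A-Fa-f]{8})\s+(\d+)\s+(\S+)\s*$")


def parse_exports(dump_text):
    """Map exported symbol name -> (entry point, ordinal) for matching lines."""
    exports = {}
    for line in dump_text.splitlines():
        match = EXPORT_LINE.match(line)
        if match:
            address, ordinal, name = match.groups()
            exports[name] = (int(address, 16), int(ordinal))
    return exports


# The single line of output shown above.
dump = "  0089FAB0 42047 TF_LoadPluggableDeviceLibrary\n"
exports = parse_exports(dump)

for wanted in ("TF_LoadPluggableDeviceLibrary", "TF_NoSuchFunction"):
    print(wanted, "exported" if wanted in exports else "missing")
```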

So this sounds like a bug in the C/C++ build of TF Core.

@rnett
Contributor

rnett commented Jan 19, 2022

I exposed a bunch of things manually as part of the custom gradients PR, you might be able to add this to this list.

@saudet
Contributor Author

saudet commented Jan 19, 2022

> I exposed a bunch of things manually as part of the custom gradients PR, you might be able to add this to this list.

Thanks! But it's not the exported symbols that are missing, it's the functions themselves. For some reason Bazel doesn't compile the experimental APIs when building libraries for the C/C++ API, but it does compile them when building for the Python API.

@cramasam

Hi,

Does the issue with this PR still persist? We support our backend in TensorFlow through PluggableDevice. To provide a Java interface, we need support for TF_LoadPluggableDeviceLibrary(). Can we revisit this PR for PluggableDevice support?

@Craigacp, @saudet,

Thank you.

@Craigacp
Collaborator

We'd need to see if it's supported by the binaries on PyPI for the relevant platforms as we now use those rather than building the TF native library ourselves. What platforms do you need?

@cramasam

> We'd need to see if it's supported by the binaries on PyPI for the relevant platforms as we now use those rather than building the TF native library ourselves. What platforms do you need?

We need it for 'linux-x86_64'. Our plugins work with the native TF (https://pypi.org/project/tensorflow/).
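
Whether a given Linux build exports the function can be checked without any TensorFlow-specific tooling. This is a minimal sketch using Python's standard `ctypes` module; `library_exports` is a hypothetical helper, and the path in the usage note is a placeholder rather than a real location. Looking up a missing symbol on a `ctypes.CDLL` object raises `AttributeError`, which `hasattr` turns into `False`.

```python
import ctypes


def library_exports(path, symbol, loader=ctypes.CDLL):
    """Return True if the shared library at `path` exports `symbol`.

    `loader` defaults to ctypes.CDLL; it is a parameter only so the check
    can be exercised without a real library.
    """
    library = loader(path)  # raises OSError if the library cannot be loaded
    # ctypes resolves the symbol on attribute access and raises AttributeError
    # when it is absent, so hasattr() reports whether it is exported.
    return hasattr(library, symbol)
```

A call would look like `library_exports("/path/to/native/library.so", "TF_LoadPluggableDeviceLibrary")`, with the path replaced by wherever the library from the PyPI wheel is unpacked. Loading the library runs its initializers, so this is heavier than inspecting the symbol table.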

@Craigacp
Collaborator

Ok, we can take a look at it after 1.0 is released.
