Question about Algorithm 1 Plane Clustering #3

Open
bluekingsong opened this issue Jun 4, 2019 · 1 comment

Comments

@bluekingsong

Q1. Why does Algorithm 1 take the segmentation result of a whole image (H*W points) as input, while its output is only a single text bounding box (4 planes)?

Q2. What are the details of the INITPLANES function? What are the parameters (A, B, D) after calling it? I cannot see this from the paper.

Thanks !

@JingChaoLiu
Collaborator

Q1. Why does Algorithm 1 take the segmentation result of a whole image (H*W points) as input, while its output is only a single text bounding box (4 planes)?

A1: The plane clustering algorithm aims to rebuild the pyramid from a single text region (text_mask = Tensor[C=1, H=28, W=28]). The input of this algorithm is one text mask, not the whole image. An image normally contains a dozen or so text regions, and plane clustering rebuilds one pyramid for each text mask separately.
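
To make the per-mask input concrete, here is a minimal sketch of turning each 28x28 text mask into the (x, y, z) point list that the clustering works on. The function names `mask_to_points` and `points_per_mask`, the use of NumPy instead of torch, and the `threshold` that decides which points count as positive are all assumptions for illustration, not taken from the paper or the repository.

```python
import numpy as np

def mask_to_points(text_mask, threshold=0.0):
    """Convert one text mask of shape (1, H, W) into points of shape (point_num, 3).

    Each row is (x, y, z), where z is the mask value at pixel (x, y).
    The threshold for "positive" points is an assumption of this sketch.
    """
    heights = np.asarray(text_mask, dtype=np.float64)[0]  # drop the channel axis -> (H, W)
    ys, xs = np.nonzero(heights > threshold)              # row index is y, column index is x
    zs = heights[ys, xs]
    return np.stack([xs, ys, zs], axis=1).astype(np.float64)

def points_per_mask(text_masks, threshold=0.0):
    """Build one point list per text mask; each mask is processed independently."""
    return [mask_to_points(mask, threshold) for mask in text_masks]
```

Each entry of the returned list would then be clustered on its own, giving one set of four planes per text region.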

Q2. What are the details of the INITPLANES function? What are the parameters (A, B, D) after calling it? I cannot see this from the paper.

A2: Given the positive point list P = Tensor[point_num, Channel={x, y, z}], INIT_PLANES does the following:

  1. Calculate the approximate apex of the pyramid, denoted E:
     E.x, E.y = mean(P[:, :2])
     E.z = 1
  2. Produce the initial planes. We simply link the apex E to the corner points {(0, 0, 0), (0, 28, 0), (28, 28, 0), (28, 0, 0)} to form the four inclined planes.
     Note: (0, 0, 0) means x=0, y=0, z=0.
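
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: it assumes each plane is written as z = A*x + B*y + D (the thread does not spell out the plane equation), that each inclined plane passes through the apex and two adjacent base corners, and it uses NumPy with a hypothetical function name `init_planes`.

```python
import numpy as np

# Base corners from the answer above, as (x, y, z).
BASE_CORNERS = np.array(
    [[0, 0, 0], [0, 28, 0], [28, 28, 0], [28, 0, 0]], dtype=np.float64
)

def init_planes(points, corners=BASE_CORNERS):
    """Return the approximate apex E and four initial planes as rows (A, B, D).

    Assumption: a plane is parameterized as z = A*x + B*y + D.
    """
    points = np.asarray(points, dtype=np.float64)
    # Step 1: apex E sits above the mean (x, y) of the positive points, with E.z = 1.
    apex = np.array([points[:, 0].mean(), points[:, 1].mean(), 1.0])

    # Step 2: each inclined plane goes through the apex and two adjacent corners.
    planes = []
    for i in range(4):
        three = np.stack([corners[i], corners[(i + 1) % 4], apex])
        lhs = np.column_stack([three[:, 0], three[:, 1], np.ones(3)])
        # Solve [x y 1] @ (A, B, D) = z for the three points.
        # This fails if the apex lies exactly on a base edge (singular system).
        planes.append(np.linalg.solve(lhs, three[:, 2]))
    return apex, np.stack(planes)
```

Under this parameterization, a point list whose mean is (14, 14) gives four planes that each evaluate to 1 at the apex and to 0 along one edge of the 28x28 base.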

Another thing worth mentioning:
During clustering, this algorithm only cares about the four inclined planes individually. In other words, we only require the four inclined planes to be independent; they are not constrained to share a common apex. So we can rebuild both a pyramid and a square frustum from the text mask.
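
One way to see the difference between the two shapes is to test whether four fitted planes meet in a single point. The sketch below is only an illustration: it assumes planes are given as (A, B, D) with z = A*x + B*y + D, and `common_apex` and its tolerance `tol` are hypothetical names that do not come from the paper or the repository.

```python
import numpy as np

def common_apex(planes, tol=1e-6):
    """Return the shared apex (x, y, z) of the planes, or None if they have none.

    Assumption: each plane is a row (A, B, D) meaning z = A*x + B*y + D.
    """
    planes = np.asarray(planes, dtype=np.float64)
    # Rewrite z = A*x + B*y + D as A*x + B*y - z = -D and solve in the least-squares sense.
    lhs = np.column_stack([planes[:, 0], planes[:, 1], -np.ones(len(planes))])
    rhs = -planes[:, 2]
    apex, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    # A near-zero residual means every plane passes through the same point.
    residual = np.linalg.norm(lhs @ apex - rhs)
    return apex if residual < tol else None
```

Four planes with equal slopes over a 28x28 base meet at one apex (a pyramid), while planes with steeper slopes on two opposite sides do not, which corresponds to the frustum case where the top is cut off by the lower pair.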