-
This is a fair point, and something that I'm guilty of overlooking. I haven't spent much time thinking about how to get to the point of giving data to the library. I don't think FlatSharp is unusual in that regard (most serializers I'm familiar with require you to provide a fully populated object graph). An interesting exception is the Google FlatBuffer library, which exposes a builder pattern that deals with problems such as this.

To be contrarian for a moment, have you considered pooling those objects rather than making new ones each time? FlatSharp won't use them any more after the serialize operation finishes, so you could do something along the lines of:

private static readonly ThreadLocal<Queue<KeyValue>> SimplePool = new(() => new Queue<KeyValue>());

public void Class1000()
{
    var kvs = new List<KeyValue>();
    var map = new Map
    {
        Items = kvs
    };
    var buffer = new byte[1024];
    var now = DateTime.Now.ToFileTimeUtc();
    var rowsCount = rows.Length;
    var headerCount = header.Length;
    var pool = SimplePool.Value;
    for (int r = 0; r < 1000; r++)
    {
        var row = rows[r % rowsCount];
        for (int col = 0; col < row.Length && col < headerCount; col++)
        {
            if (string.IsNullOrWhiteSpace(row[col].String)) continue;
            // Reuse a pooled instance when one is available; allocate only on a miss.
            if (!pool.TryDequeue(out KeyValue kv))
            {
                kv = new KeyValue();
            }
            kv.Key = header[col];
            kv.Value = row[col];
            kv.Timestamp = now;
            kvs.Add(kv);
        }
        // serialize Kvs
        //var writtenSize = serialize(map, buffer);
        // Return every item to the pool once serialization is done with it.
        foreach (var item in kvs)
        {
            pool.Enqueue(item);
        }
        kvs.Clear();
    }
}

I'm sure there are even more clever ways you could do the pooling. The benchmark is missing some code, so I can't run it on my box.
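One way the rent/return steps above could be factored out is a small generic pool. This is only a sketch: `SimpleObjectPool<T>` is a hypothetical helper name, not something FlatSharp provides, and it is single-threaded, so it assumes one instance per thread (for example behind a `ThreadLocal<T>`, as above).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of FlatSharp.
// Not thread-safe: assumes one pool per thread.
public sealed class SimpleObjectPool<T> where T : class, new()
{
    private readonly Stack<T> items = new Stack<T>();

    // Number of instances currently waiting to be reused.
    public int Count => this.items.Count;

    // Returns a pooled instance if one is available, otherwise allocates a new one.
    public T Rent() => this.items.Count > 0 ? this.items.Pop() : new T();

    // Hands an instance back to the pool; the caller must not use it afterwards.
    public void Return(T item) => this.items.Push(item);
}
```

In the loop above, `pool.TryDequeue(...)` would become `pool.Rent()` and `pool.Enqueue(item)` would become `pool.Return(item)`. Whether this is measurably faster than the `Queue<T>` version is not something this sketch claims.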
-
@jamescourtney I've created a branch with these benchmarks in it at #214 so you can play around if you like.
-
I'm beginning to get my head around the idea of parallel reference/value types. Part of me still thinks, "We're using C#; allocation and GC are ways of life here." That said, I get that there are Unity users and other folks with true high-performance needs who would benefit from something like this.

However, I don't think FlatSharp is going to be willing or able to parse value-type tables. That's a huge work item, and it ends up obscuring a lot of the value that FlatSharp provides (different deserialization modes, etc.). Parsed tables will need to remain reference types for the foreseeable future, so please understand that this will probably remain a "won't fix".

Serializing value-type tables is (relatively) straightforward with some modest refactoring. I still need to prototype this and think some things through pretty thoroughly (i.e., should value tables be allowed to contain reference tables, or is it turtles all the way down?), so please don't take this as a commitment or anything like that.
-
Took a look at your recycling benchmark, and there are some simple ways to improve it. Here are my results:
Here's the code. There are two sources of inefficiency in the pooling approach. The first is that repeatedly accessing [...]

Based on these numbers, I further suspect that most of the inefficiencies you see now are due to the Union type being a class and not a struct. I have a feeling that if you combined pooling and value unions, you'd see numbers roughly on par with the struct-based approach.

private static readonly ThreadLocal<List<KeyValue>> SimplePool = new(() => new List<KeyValue>());

[Benchmark]
public void Class1000Recycled()
{
    var kvs = SimplePool.Value;
    var buffer = new byte[1024];
    var now = DateTime.Now.ToFileTimeUtc();
    var rowsCount = rows.Length;
    var headerCount = header.Length;
    for (int r = 0; r < 1000; r++)
    {
        int count = 0;
        var row = rows[r % rowsCount];
        for (int col = 0; col < row.Length && col < headerCount; col++)
        {
            string rowValue = row[col];
            if (string.IsNullOrWhiteSpace(rowValue))
            {
                continue;
            }
            KeyValue kv;
            if (count < kvs.Count)
            {
                kv = kvs[count];
            }
            else
            {
                kv = new();
                kvs.Add(kv);
            }
            kv.Key = new Primitive(header[col]);
            kv.Value = new Primitive(rowValue);
            kv.Timestamp = now;
            count++;
        }
        ListWithKnownLength<KeyValue> list = new ListWithKnownLength<KeyValue>(kvs, count);
        var map = new Map
        {
            Items = list
        };
        // serialize Kvs
        //var writtenSize = serialize(map, buffer);
        //kvs.Clear();
    }
}

// Read-only view over the first `length` items of a pooled list.
// Anything past `length` is a stale entry left over from an earlier row.
private class ListWithKnownLength<T> : IList<T>
{
    private readonly List<T> inner;
    private readonly int length;

    public ListWithKnownLength(List<T> inner, int length)
    {
        this.inner = inner;
        this.length = length;
    }

    public T this[int index]
    {
        get => inner[index];
        set => throw new NotImplementedException();
    }

    public int Count => this.length;

    public bool IsReadOnly => true;

    public void Add(T item)
    {
        throw new NotImplementedException();
    }

    public void Clear()
    {
        throw new NotImplementedException();
    }

    public bool Contains(T item)
    {
        return this.IndexOf(item) >= 0;
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        // Copy only the live prefix, not the stale tail.
        inner.CopyTo(0, array, arrayIndex, length);
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Enumerate only the live prefix so that enumeration agrees with Count.
        for (int i = 0; i < length; i++)
        {
            yield return inner[i];
        }
    }

    public int IndexOf(T item)
    {
        // Search only the live prefix.
        return inner.IndexOf(item, 0, length);
    }

    public void Insert(int index, T item)
    {
        throw new NotImplementedException();
    }

    public bool Remove(T item)
    {
        throw new NotImplementedException();
    }

    public void RemoveAt(int index)
    {
        throw new NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
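As an aside, if the pool were a plain array rather than a `List<T>`, the same "first N items only" view could probably come from the BCL: `ArraySegment<T>` implements `IList<T>`. The sketch below assumes that `Map.Items` accepts any `IList<T>`, which the wrapper above suggests but which has not been checked against FlatSharp here. `PrefixView` is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of FlatSharp.
public static class PrefixView
{
    // Exposes only the first `count` slots of a reusable backing array.
    // Converting the ArraySegment<T> struct to IList<T> boxes it, so this
    // still costs one small allocation per call, much like the wrapper class.
    public static IList<T> Of<T>(T[] pooled, int count)
    {
        return new ArraySegment<T>(pooled, 0, count);
    }
}
```

The view is fixed-size: `Add`, `Insert`, and `Remove` throw, which is fine for a serialize-only path. The trade-off is that the backing array has to be grown by hand when a row has more values than it has slots.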
-
The Problem
Before I can serialize, I must create the objects to be serialized. In many cases these objects must be heap-allocated classes. This is an attempt at improving use cases like parsing a file, or receiving data in another format, before getting it into a buffer.
A Solution
Generate struct duals of Tables that can be used in the serialization process.
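To make the idea concrete, here is a rough sketch of what a table and its struct dual might look like. Every name here is hypothetical and FlatSharp does not generate either type in this form; `string` stands in for whatever the real key and value types are.

```csharp
using System;

// Hypothetical shapes for illustration only; not FlatSharp-generated code.

// Reference-type table: one heap allocation per item.
public class KeyValueTable
{
    public string Key { get; set; }
    public string Value { get; set; }
    public long Timestamp { get; set; }
}

// The "struct dual": the same fields, but no per-item heap object,
// so a row can be built in an array or on the stack before serializing.
public struct KeyValueStruct
{
    public string Key;
    public string Value;
    public long Timestamp;

    // Copies the fields into the reference-type form.
    public KeyValueTable ToTable()
    {
        return new KeyValueTable { Key = this.Key, Value = this.Value, Timestamp = this.Timestamp };
    }
}
```

The proposal is that the serializer would accept the struct form directly, so the `ToTable()` copy shown here would not be needed on the write path.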
Some data
Sample Benchmark