#how can I remove duplicates of a certain object in vector?
66 messages · Page 1 of 1 (latest)
When your question is answered use !solved to mark the question as resolved.
Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For more information use !howto ask.
You could use an unordered_map maybe?
I guess at some point you may want to access the object by ID.
PS: you could make that an ordered map too if you want.
But you pay a performance penalty and in the majority of implementations the actual order of the map matters not.
what would would be the "key" of the map ?
The ID
It's not a big deal if this ID was repeated in the object itself.
In case that has you puzzled.
yeah I found that a bit weird, the ID is already part of the object
what about removing duplicates manually? would that have a deterimental effect on performance ?
or is map better?
There are other ways.
For example, storing in a set rather than a vector.
But that requires some minor boilerplate code.
but how would a set actually determine if an object is unique?
Through the boilerplate code.
because in some case, satellite might have the same ID but different other parameters. In that case I want to remove that duplicate.
can you give advice on that, that sounds like a good idea actually
Give me a a few minutes to come up with an example.
How is this manual method? 1- Check if the count(overload "==" operator) of an object is greater than 1. 2- Then remove that object count many times until there is only 1 left.
The manual route would be via std::find_if(), I guess.
you could also std::sort by ID, then std::unique to remove consecutive duplicates, then vec.erase to get rid of them
but if you do this a lot, just keeping it sorted / in a std::set might be better? idk
ok thanks for help
Here it is. I am sure it can be bettered by some cleverer than I
#include <set>
#include <string>
#include <iostream>
#include <cassert>
class Satellite
{
friend bool operator!=(const Satellite& lh, const Satellite& rh);
public:
Satellite(int id, const std::string& name)
: id(id)
, name(name)
{}
void print() const
{
std::cout << id << " : " << name << std::endl;
}
private:
int id;
std::string name;
};
bool operator!=(const Satellite& lh, const Satellite& rh)
{
if (lh.id == rh.id)
return false;
if (lh.name == rh.name)
return false;
return true;
}
int main()
{
std::set<Satellite, std::not_equal_to<Satellite>> sats;
{
// success: first insertion always works
auto [i, inserted] = sats.emplace(1, "ABC");
assert(inserted);
}
{
// error: Satellite 1 already exists
auto [i, inserted] = sats.emplace(1, "DEF");
assert(!inserted);
}
{
// error: Satellite "ABC" already exists
auto [i, inserted] = sats.emplace(2, "ABC");
assert(!inserted);
}
{
// success: Satellite 2,"DEF" is unique
auto [i, inserted] = sats.emplace(2, "DEF");
assert(inserted);
}
for (const auto& s : sats)
{
s.print();
}
}
thanks for taking the time to write this 😅
Useful?
Not useful?
Useful !
previously I was doing this:
void CustomClass::removeDuplicates(QVector<QGeoSatelliteInfo>& satellites){
for (int i{};i<satellites.size();++i){
for (int k{};k<satellites.size();++k){
if (i==k){
continue;
}
else{
if (satellites[i].satelliteIdentifier() == satellites[k].satelliteIdentifier()){
satellites.erase(satellites.begin()+std::min(i,k));
}
}
}
}
}``` but this is like 0(N^2) I think
hmm, I'm not sure if not_equal is strictly forming a weak order
I'd just make an operator< and let it sort by ID
because in some case, satellite might have the same ID but different other parameters. In that case I want to remove that duplicate.
Perhaps I misunderstood @potent marten but I took this to mean that the multiple attributes must be unique.
I didn't think sorting was essential.
Just uniqueness.
well, set sorts
what I meant to say was that any satellite object with the duplicate ID must be removed regardless of the other attributes.(Long story, the reason this happens because older GPS data is built up in the serial port and we don't want that)
try it with MSVC, it has debug assertions for this stuff
https://godbolt.org/z/PxfqxcE9T
godbolt doesn't show that, but this is what fails with your version @oak ferry
if (_Result) {
_STL_VERIFY(!_Pred(_Right, _Left), "invalid comparator");
}
inside std::set
Perhaps unordered_set would be better, or indeed use set with the default less predicate.
So only the ID guarantees uniqueness.
I still think unordered_map would provide a better solution, because I can easily imagine you needing to get the Satellite by this ID at various points in your code.
Still, if you are certain that this is not the case, here is the amended code (hopefully correct this time @solid cove ?)
#include <set>
#include <string>
#include <iostream>
#include <cassert>
class Satellite
{
friend bool operator<(const Satellite& lh, const Satellite& rh);
public:
Satellite(int id, const std::string& name)
: id(id)
, name(name)
{}
void print() const
{
std::cout << id << " : " << name << std::endl;
}
private:
int id;
std::string name;
};
bool operator<(const Satellite& lh, const Satellite& rh)
{
return lh.id < rh.id;
}
int main()
{
std::set<Satellite> sats;
{
// success: first insertion always works
auto [i, inserted] = sats.emplace(1, "ABC");
assert(inserted);
}
{
// error: Satellite 1 already exists
auto [i, inserted] = sats.emplace(1, "DEF");
assert(!inserted);
}
{
// success: Satellite 2,"ABC" is unique
auto [i, inserted] = sats.emplace(2, "ABC");
assert(inserted);
}
{
// error: Satellite 2 already exists
auto [i, inserted] = sats.emplace(2, "GHI");
assert(!inserted);
}
for (const auto& s : sats)
{
s.print();
}
}
Note that you can now have two Satellites with the same name, but no two Satellites can exist with the same ID.
Or: ID guarantees uniqueness.
there is std::unique
yeah, I mentioned it above
but it also requires keeping things sorted
so might as well use an ordered container if you're gonna do it on each insert
I tested this and this is bad, when satellites.erase() is called it can crash the program due to i being out of range. Then I tried this(which works well and based on @solid cove 's advice): cpp void CustomClass::removeDuplicates(QVector<QGeoSatelliteInfo>& satellites) { std::sort(satellites.begin(), satellites.end(), compare_satelliteID); auto last = std::unique(satellites.begin(), satellites.end(), duplicate_IDcheck); satellites.erase(last, satellites.end()); } but which method is the fastest: 1- Use dimwits initial idea of QMap(similar to std::map) with the key being the satellite ID. 2- Stick to this code snippet. 3- Use the std::set shown by dimwit ?
performance is really important here because as soon as data is parsed from the serial port it needs to remove duplicates and emit another signal
sort + unique + erase lets you choose when you remove duplicates,
std::set will prevent duplicates on each insertion
a hash map (using the ID as the key) wouldn't need to sort
so I think a hash map or hash set should be faster
but you'd have to try it
I noticed you mentioned QMap.
If you are on Qt, consider QHash instead.
QMap == std::map
QHash == std::unordered_map
for example I could imagine decoding a few things from the serial port and putting them into the vector,
and only deduplicating once you want to use the data / think there won't be more data / ???
so you don't pay for the lookup every time you decode a thing
but if you have a hash map, the lookup should be cheap
so it might not matter 🤷
okay the QHash method works quite well. Now do I close this post right?
how do I mark it as solved?
¯_(ツ)_/¯
!solved
You can only solve threads you own
You can (should?) also right-click the thread in the tree-view, select tags, and add "solved".
@potent marten
okay I added the solved tag
You could also type !solved here
!solved
[SOLVED] how can I remove duplicates of a certain object in vector?