
I have a collection of objects (Pos) with this model:

    public class Pos {
        private String beforeChangement;
        private String type;
        private String afterChangement;
    }

The list of objects looks like this:

    [
      Pos(beforeChangement=Découvrez, type=VER, afterChangement=découvrir),
      Pos(beforeChangement=un, type=DET, afterChangement=un),
      Pos(beforeChangement=large, type=ADJ, afterChangement=large),
      Pos(beforeChangement=., type=SENT, afterChangement=.),
      Pos(beforeChangement=Livraison, type=NOM, afterChangement=livraison),
      Pos(beforeChangement=et, type=KON, afterChangement=et),
      Pos(beforeChangement=retour, type=NOM, afterChangement=retour),
      Pos(beforeChangement=., type=SENT, afterChangement=.),
      Pos(beforeChangement=achetez, type=VER, afterChangement=acheter),
      Pos(beforeChangement=gratuitement, type=ADV, afterChangement=gratuitement),
      Pos(beforeChangement=., type=SENT, afterChangement=.),
      Pos(beforeChangement=allez, type=VER, afterChangement=aller),
      Pos(beforeChangement=faites, type=VER, afterChangement=faire),
      Pos(beforeChangement=vite, type=ADV, afterChangement=vite),
      Pos(beforeChangement=chers, type=ADJ, afterChangement=cher),
      Pos(beforeChangement=clients, type=NOM, afterChangement=client),
      Pos(beforeChangement=., type=SENT, afterChangement=.)
    ]

I want to split this list of objects wherever the field beforeChangement or afterChangement equals ".", to get a list of lists (List<List<SOP>>) in this format:

    [
      [Pos(beforeChangement=Découvrez, type=VER, afterChangement=découvrir),
       Pos(beforeChangement=un, type=DET, afterChangement=un),
       Pos(beforeChangement=large, type=ADJ, afterChangement=large)],
      [Pos(beforeChangement=Livraison, type=NOM, afterChangement=livraison),
       Pos(beforeChangement=et, type=KON, afterChangement=et),
       Pos(beforeChangement=retour, type=NOM, afterChangement=retour)],
      [Pos(beforeChangement=achetez, type=VER, afterChangement=acheter),
       Pos(beforeChangement=gratuitement, type=ADV, afterChangement=gratuitement)],
      [Pos(beforeChangement=allez, type=VER, afterChangement=aller),
       Pos(beforeChangement=faites, type=VER, afterChangement=faire),
       Pos(beforeChangement=vite, type=ADV, afterChangement=vite),
       Pos(beforeChangement=chers, type=ADJ, afterChangement=cher),
       Pos(beforeChangement=clients, type=NOM, afterChangement=client)]
    ]

It is like performing an inverse flatMap: splitting the list on the objects whose field is the string "." to get a list of arrays or lists (chunks).

Do you have any idea how to do this using Streams?

Thank you guys

5 Answers

2

Hmm, I would solve your problem using a simple loop like this:

    List<List<Pos>> result = new ArrayList<>();
    List<Pos> part = new ArrayList<>();
    for (Pos pos : listPos) {
        if (pos.getBeforeChangement().equals(".") || pos.getAfterChangement().equals(".")) {
            result.add(part);          // a "." closes the current sub-list, so add it to the result
            part = new ArrayList<>();  // and start a new sub-list
        } else {
            part.add(pos);             // otherwise just add the Pos object to the current sub-list
        }
    }
    // If listPos does not end with a "." element, the last part must not be skipped
    if (!part.isEmpty()) {
        result.add(part);
    }

Note: the question is not entirely clear. The object class is named Pos in one place, while the result type is written as a list of SOP. Which one is correct? My answer is based on public class Pos{..} rather than public class SOP{..}.



4 Comments

Thank you for your answer. I made a correction: Pos and SOP are the same.
There's a significant problem with this solution: when the list of Pos does not end with ".", your code will skip the entire last sentence.
No @TomaszLinkowski, check the output in the question: the objects that contain "." are not included in the result. You can also compare the output in the question with the output of the demo mentioned in my answer.
You're right that, according to the question, the objects containing "." should be skipped (I did not notice it, and my answer does not do it). But what I mean is that if the input list does not end with a "dot" object, your code simply skips the entire last sublist instead of either including that sublist or throwing an error. See this clone of your snippet, where I removed the last line from listPos. Note that "allez", "faites", "vite", "chers", "clients" are missing from the output.
2

With the StreamEx library you can use the groupRuns method to split a list into a list of lists.

For example:

    List<List<Pos>> collect = StreamEx.of(originalList.stream())
        .groupRuns((p1, p2) -> !(".".equals(p2.beforeChangement) || ".".equals(p2.afterChangement)))
        .collect(Collectors.toList());

The groupRuns method returns a stream of lists. In the example above, every "." element starts a new list, so each list after the first has a "." element as its first element.

You can filter out these elements afterwards, for example using the map method:

    StreamEx.of(originalList.stream())
        // returns a stream of lists that still contain the '.' elements
        .groupRuns((p1, p2) -> !(".".equals(p2.beforeChangement) || ".".equals(p2.afterChangement)))
        .map(l -> l.stream()
            // filter out the '.' elements
            .filter(p -> !(".".equals(p.beforeChangement) || ".".equals(p.afterChangement)))
            .collect(Collectors.toList()))
        // filter out empty lists
        .filter(l -> !l.isEmpty())
        .collect(Collectors.toList());

5 Comments

As far as I understand, though, this code will place the dots in separate lists, right?
How can we exclude the "." in the pipeline?
Instead of map + filter related to dots (and an extra collect there), I would simply use the following: .filter(l -> !isPeriod(l.get(0))) where boolean isPeriod(Pos pos) { return ".".equals(pos.beforeChangement) || ".".equals(pos.afterChangement); }
@TomaszLinkowski your example filters out the whole list if its first element is a '.' element.
Now that I have read the code more carefully, I understand that the periods are not placed into separate lists. I got confused because I thought the predicate in groupRuns was (p1, p2) -> !isPeriod(p1) && !isPeriod(p2), while in fact it is (p1, p2) -> !isPeriod(p2). This is a slightly strange condition, though, because it means the periods go at the beginnings of the lists. However, you're right that my filtering proposal wouldn't work. Instead, I would change the predicate to (p1, p2) -> !isPeriod(p1), and then remove the last element from each list using peek if it matches isPeriod.
1

Well, I would be conservative here, and I wouldn't use Streams (although it's possible).

The following snippet does what you need:

    // posList is the input List<Pos>
    List<List<Pos>> result = new ArrayList<>();
    boolean startNewSentence = true;
    for (Pos pos : posList) {
        if (startNewSentence) {
            result.add(new ArrayList<>());           // open a new sentence
        }
        startNewSentence = isPeriod(pos);            // a period closes the current sentence
        if (!startNewSentence) {
            result.get(result.size() - 1).add(pos);  // the periods themselves are not added
        }
    }

where:

    boolean isPeriod(Pos pos) {
        return ".".equals(pos.beforeChangement()) || ".".equals(pos.afterChangement());
    }
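
For comparison, here is a minimal sketch of how the same split could be done with plain JDK streams, without StreamEx. It is only one possible approach, not the one used above: it collects the indexes of the period elements and cuts the list between them with subList. The class name SentenceSplitter, the method splitByPeriod and the reduced two-field Pos are assumptions made for this illustration. Unlike the loop above, this sketch drops empty chunks (for example between two consecutive periods), and the returned chunks are views of the original list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SentenceSplitter {

    // Hypothetical stand-in for the question's Pos class (only the fields needed here).
    static class Pos {
        final String beforeChangement;
        final String afterChangement;

        Pos(String beforeChangement, String afterChangement) {
            this.beforeChangement = beforeChangement;
            this.afterChangement = afterChangement;
        }
    }

    static boolean isPeriod(Pos pos) {
        return ".".equals(pos.beforeChangement) || ".".equals(pos.afterChangement);
    }

    // Splits the list at every period element; the periods themselves are dropped.
    static List<List<Pos>> splitByPeriod(List<Pos> posList) {
        // Indexes of all period elements, in order.
        List<Integer> periodIndexes = IntStream.range(0, posList.size())
                .filter(i -> isPeriod(posList.get(i)))
                .boxed()
                .collect(Collectors.toList());

        // Cut points: a virtual period before the list, every real period,
        // and a virtual period after the list.
        List<Integer> bounds = new ArrayList<>();
        bounds.add(-1);
        bounds.addAll(periodIndexes);
        bounds.add(posList.size());

        // Each chunk is the range strictly between two neighbouring cut points.
        return IntStream.range(0, bounds.size() - 1)
                .mapToObj(i -> posList.subList(bounds.get(i) + 1, bounds.get(i + 1)))
                .filter(chunk -> !chunk.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Pos> input = Arrays.asList(
                new Pos("Livraison", "livraison"), new Pos("et", "et"), new Pos("retour", "retour"),
                new Pos(".", "."),
                new Pos("achetez", "acheter"), new Pos("gratuitement", "gratuitement"));

        // Print only the beforeChangement values of each chunk.
        for (List<Pos> sentence : splitByPeriod(input)) {
            System.out.println(sentence.stream()
                    .map(p -> p.beforeChangement)
                    .collect(Collectors.toList()));
        }
    }
}
```

Note that the last chunk is kept even when the input does not end with a period, which is the case discussed in the comments on the first answer.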

PS. Note there's no such word as "changement" in English. The noun from verb "change" is also "change".

2 Comments

Thank you Tomasz. this logic can solve my problem +1
@Dr.Mza I updated the code so that the elements with periods are not included in the result.
0

Collectors.groupingBy() may help you.
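
One way this hint might be made to work is to group not by a field value but by a sentence index computed beforehand. The sketch below is only an illustration under that assumption: the class GroupBySentence and the method split are hypothetical names, and plain strings stand in for the beforeChangement values of the Pos objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class GroupBySentence {

    // Splits tokens into sentences; "." marks the end of a sentence and is dropped.
    static List<List<String>> split(List<String> tokens) {
        // sentenceOf[i] = number of "." tokens seen before position i.
        int[] sentenceOf = new int[tokens.size()];
        int periodsSeen = 0;
        for (int i = 0; i < tokens.size(); i++) {
            sentenceOf[i] = periodsSeen;
            if (".".equals(tokens.get(i))) {
                periodsSeen++;
            }
        }

        // Group the non-period positions by their sentence index.
        // The TreeMap keeps the sentences sorted by index; the stream is sequential,
        // so tokens stay in their original order inside each group.
        TreeMap<Integer, List<String>> bySentence = IntStream.range(0, tokens.size())
                .filter(i -> !".".equals(tokens.get(i)))
                .boxed()
                .collect(Collectors.groupingBy(
                        i -> sentenceOf[i],
                        TreeMap::new,
                        Collectors.mapping(i -> tokens.get(i), Collectors.toList())));

        return new ArrayList<>(bySentence.values());
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "Livraison", "et", "retour", ".", "achetez", "gratuitement", ".");
        System.out.println(split(tokens));
        // prints [[Livraison, et, retour], [achetez, gratuitement]]
    }
}
```

The index array is built in an ordinary loop because the sentence index of a token depends on all the tokens before it, which does not fit a stateless stream operation well.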


0

Let's say your list of SOP objects is named listSOP. Then:

    List<SOP> listSOP = new ArrayList<>();
    // .... populate your list.
    Map<String, List<SOP>> map = listSOP.stream()
        .collect(Collectors.groupingBy(SOP::getBeforeChangement));

This should return a Map<String, List<SOP>> whose keys are the beforeChangement values.

Here getBeforeChangement is the getter method in your SOP class, which should return the value of the variable beforeChangement.

1 Comment

This will not work: it will group together all the SOPs with the same beforeChangement instead of partitioning the original list into ordered sublists.
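
The effect described in this comment can be seen in a small sketch. It is an illustration only: the class GroupingByValueDemo and the method groupByValue are hypothetical names, and plain strings stand in for the beforeChangement values. Grouping by value puts every "." into a single bucket, regardless of where it occurred, so the sentence boundaries are lost.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class GroupingByValueDemo {

    // Groups the positions of the tokens by the token value itself,
    // which is what groupingBy on a field such as beforeChangement does.
    static Map<String, List<Integer>> groupByValue(List<String> tokens) {
        return IntStream.range(0, tokens.size())
                .boxed()
                .collect(Collectors.groupingBy(i -> tokens.get(i)));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "un", "large", ".", "Livraison", "et", "retour", ".");
        Map<String, List<Integer>> groups = groupByValue(tokens);

        // Both periods end up under the same key, whatever sentence they closed.
        System.out.println(groups.get("."));
        // prints [2, 6]
    }
}
```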
