#How to early break process block without terminating the whole pipeline

true briar
#

I want to pass only a certain number of items down the pipeline and terminate the process block once that count is reached, to avoid spinning needlessly over unwanted items, as in the following function:

function first5 {
    begin {
        $count = 0
    }

    process {
        if (++$count -gt 5) {
            break # ??
        }
        $_
    }
}

The output looks fine, but measure doesn't seem to capture the items. Does that mean the pipeline is terminated, so measure receives nothing?

1..10 | first5 # 1..5
1..10 | first5 | measure # none
ornate lotus
#

In other words, you cannot terminate the process block with a break and expect the pipeline to survive.

#

The pipeline exists up until the moment you break, however:

  function first5break {
      begin {
          $count = 0
      }

      process {
          if (++$count -gt 5) {
              break ; # send nothing to the pipeline
          } else {
              $_
          }
      }
  }
  function show {  process { write-host $_ } }

  'asdlfj','asdfsklj','alskjf',44,'sldkj',45,9,'owjjs' | first5break | show
asdlfj
asdfsklj
alskjf
44
sldkj
true briar
# ornate lotus The pipeline exists up until the moment you break, however: ```pwsh function f...

You're right. So it feels like it's measure's problem? But I also ran into it when piping to my own custom function. I've checked with Trace-Command -Name ParameterBinding -Expression { 1..10 | first5 | measure } -PSHost, and the binding looks fine. For example:

function collect {
    begin {
        $foo = [System.Collections.Generic.List[object]]@()
    }
    process {
        $foo.Add($_)
    }
    end {
        $foo
    }
}

1..10 | first5 | collect # none
1..10 | select -f 5 | collect # 1..5
manic halo
#

Don't use break. That's not its intended use, and it will try to find a loop to kill. That it stops a function when there's no loop is luck, and it still leaves you with a terrible gotcha.
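
A minimal sketch of that gotcha: with no loop inside the function, break climbs the call stack and ends the nearest enclosing loop in the caller. `Invoke-StrayBreak` is a hypothetical helper name used only for illustration.

```powershell
# Hypothetical helper whose body is just a stray break (no loop of its own).
function Invoke-StrayBreak { break }

$seen = foreach ($i in 1..3) {
    "before $i"
    Invoke-StrayBreak   # break finds no loop in the function, so it ends this foreach
    "after $i"          # never reached
}

$seen   # only 'before 1' was emitted before the caller's loop was ended
```

When there is no enclosing loop at all, the break keeps climbing and stops the whole statement, which is consistent with measure and collect never getting to run their end blocks.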

devout ruin
# true briar You're right. So it feels like `measure`'s problem? But I also encountered it w...

A process block isn't a control-flow construct (a loop or switch) that break applies to.

It effectively runs like a function body, once per input item. So to skip the rest of the logic for an item, return works.

function First5 {
   param(
      [Alias('Num')][int] $Count = 5
   )
   begin { $num = 0 }
   process {
      # $num counts the items seen so far; stop emitting once $Count have passed
      if( $num++ -ge $Count ) {
            return
      }
      $_
   }
}

> 0..0x10ffff | First5 3
# 0, 1, 2

Note that the process block still runs once for every input item, but exiting early like this is probably good enough for speed.

Measure-Command -Expression { 
 & { 
    0..0x10ffff | First5 3 | Join-String -sep ', '  | write-host
 }
}
# ~240 ms
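
To see that return only skips the rest of the current item, here is a small sketch that counts how often the process block runs. `Select-FirstCounted` and the `$stats` hashtable are hypothetical names added for illustration, not part of the function above.

```powershell
# Shared counter; the function mutates the hashtable it finds in the parent scope.
$stats = @{ Calls = 0 }

function Select-FirstCounted {
    param([int] $Count = 5)
    begin { $seen = 0 }
    process {
        $stats.Calls++                    # count every invocation of process
        if ($seen -ge $Count) { return }  # skip the item, but the block was still entered
        $seen++
        $_
    }
}

$out = 1..1000 | Select-FirstCounted -Count 3
# $out holds 1, 2, 3 while $stats.Calls is 1000
```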
#

It fixes the piping issue

function Csv { 
   param( 
      [string] $Delim = ', '
   )
   begin {  [Collections.Generic.List[object]] $list = @() }
   process {
       $list.Add( $_ )  
   }
   end { $list -join $delim }
} 
true briar
#

yeah I know about return, but that still spins. So there's just no way to skip the redundant iterations? that's sad.

#

and measure can't receive items while select can. Shouldn't that be considered a bug or something?

twin dune
#

There’s no public API to do so. The closest thing is adding on Select-Object -First 1 to break on the first output but that doesn’t work or help in all situations. The trouble with such an API is that it breaks a lot of the logic that cmdlets put in the end block as they are no longer run. Best just be a good citizen and make the process block check some flag and return early
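
For comparison, a rough sketch of how appending Select-Object -First stops the commands before it, so the upstream stage does not keep spinning. The `$stats` hashtable is an illustrative assumption used to count how many items were actually produced.

```powershell
# Counter mutated by the upstream stage each time it handles an item.
$stats = @{ Produced = 0 }

$out = 1..1000 |
    ForEach-Object { $stats.Produced++; $_ } |
    Select-Object -First 5

# Downstream commands still complete normally on the items that got through.
$count = ($out | Measure-Object).Count
# $stats.Produced is 5 and $count is 5
```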

drowsy flower
#

What? You can just throw [System.Management.Automation.PipelineStoppedException]::new()

twin dune
#

huh I thought it still ran the end block but TIL

drowsy flower
#

It does

#

I mean, it stops the pipeline cleanly

#

So, like:

& {
    begin   { 1..5 }
    process { 10..15 }
    end     { 20..25 }
} |
& {
    [CmdletBinding()]
    param([Parameter(ValueFromPipeline)] $InputObject)
    begin {
        $count = 0
    }
    process {
        $_
        if ($count++ -eq 10) {
            throw [System.Management.Automation.PipelineStoppedException]::new()
        }
    }
    end {
        "Ho, Ho"
    }
} | Write-Host -ForegroundColor cyan
#

That would write the 1..5 and 10..15 and then stop

#

No end

#

If you use the Select -First does it end?

#

I was thinking that PipelineStopped would run the End even though it stops, but it doesn't.

twin dune
#

It works like Select -First, which I did not think was possible without reflection. It’ll skip all the end blocks, which is the reason the clean block was added
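
A hedged sketch of the clean block mentioned here, assuming PowerShell 7.3 or later (older versions will not parse it). `Get-SampleNumber` and `$log` are hypothetical names; the log is a list because anything a clean block writes to the success stream is not passed down the pipeline.

```powershell
# Records which blocks actually ran.
$log = [System.Collections.Generic.List[string]]::new()

function Get-SampleNumber {
    [CmdletBinding()]
    param()
    process { 1..10 }
    end     { $log.Add('end') }     # skipped when the pipeline is stopped early
    clean   { $log.Add('clean') }   # intended to run even when it is stopped
}

# Select-Object -First stops Get-SampleNumber after the third item.
$result = Get-SampleNumber | Select-Object -First 3
# $result holds 1, 2, 3 and $log should contain only 'clean'
```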

drowsy flower
#

no, see ...

#

Select -First would stop the INPUT, but then the end block still runs, so the pipeline stops clean

#

This didn't

twin dune
#

I was under the assumption Select -First does the same thing and stops all subsequent blocks regardless of where they were

drowsy flower
#

try this:

& {
  begin { 1..5 }
  process { 10..15 }
  end { 20..25 }
} |
Select -First 10 |
& {
begin { "Hello" }
process { $_ }
end { "Goodbye" }
}
twin dune
#

I’m on my phone sorry

#

I assume the end on the first cmdlet won’t run

#

But the subsequent Goodbye one will

drowsy flower
#

So that will output Hello,1,2,3,4,5,10,11,12,13,14,Goodbye

#

Right. the pipeline BEFORE Select gets stopped, but everything after finishes.

twin dune
#

If you add [Console]::WriteLine('end') to the first one I expect it doesn’t run at all due to Select stopping that cmdlet after the First finished

#

Sure because -First ends what was running before but the output it has is still valid and it has no clue what it was for
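
A small sketch of that claim, recording which end blocks run when Select-Object -First cuts off the upstream command. The `$log` list and the log labels are assumptions added for illustration; the script blocks otherwise follow the example above.

```powershell
# Records which end blocks actually ran.
$log = [System.Collections.Generic.List[string]]::new()

$out = & {
    begin   { 1..5 }
    process { 10..15 }
    end     { $log.Add('upstream end'); 20..25 }
} |
Select-Object -First 10 |
& {
    begin   { 'Hello' }
    process { $_ }
    end     { $log.Add('downstream end'); 'Goodbye' }
}

# $log contains only 'downstream end'; $out has 12 items ending in 'Goodbye'
```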

drowsy flower
#

So the PipelineStoppedException ... doesn't do that

#

It just stops everything

twin dune
#

I see what you mean now sorry

#

To me this is all the more reason to just have a flag and return early. It’s hard to reason what such an API would do and wouldn’t pass the 2am test

drowsy flower
#

I mean, the exception in question StopUpstreamCommandsException is pretty weird.

#

The semantics of stopping is intended to mimic a user pressing Ctrl-C [but which only affects upstream cmdlets].

#

You can also just run continue, which does almost the same as the PipelineStoppedException in this case.

#

(which also explains why you shouldn't use continue)

twin dune
#

Even if someone wrote it there’s no chance it’ll be merged. Too many open questions around behaviour and with RFCs being effectively dead it needs a champion in the team to really push it through